Can someone please explain to me the difference between a primary key, foreign key, clustered index, natural key, and surrogate key?


Learning SQL right now and the definitions for these terms are a little confusing. Can someone please explain them with an example? Thank you.


18 Answers

Anonymous 0 Comments

let’s say you have a table and you’re going to use it to keep track of the new friends you just met. each friend gets a number, and that’s their ID. the table has an ID column and a name column. now let’s say that instead of name tags, everyone wears a sticker with their ID number on it. you see that one new friend has 3 as their ID, so you go to the table to look up their name, but you find two rows with an ID of 3, each with a different name. that makes the table useless, right? that’s what primary keys are for. a primary key requires that every value in that column is unique (no duplicates) and never empty (NULL), so each person gets an ID that nobody else has. unique keys are close cousins: they also forbid duplicates, but a table can have several of them and only one primary key.
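here is a minimal sketch of that duplicate-ID situation, assuming SQLite driven from Python's built-in sqlite3 module (other databases behave similarly but differ in syntax). the table layout and the names 'alice' and 'bob' are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# id is the primary key, so every value in it must be unique.
conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO friends (id, name) VALUES (3, 'alice')")

# Trying to give a second friend the same ID is rejected by the database.
try:
    conn.execute("INSERT INTO friends (id, name) VALUES (3, 'bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT id, name FROM friends").fetchall()
print(duplicate_rejected, rows)
```

the second insert fails, so the table still holds exactly one row for ID 3.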

now let’s say you all have pets (dogs, cats, etc). you could add a pets column to your first table, but anyone with more than one pet would need more than one row, and that just won’t work, because the ID has to be unique. you could add extra columns for more pets, but that’s not smart: you’re limited by the number of columns you have, and you’ll have to keep adding more as people keep getting more and more pets. so instead, you create a pets table. but how do you keep track of which pets belong to which person? their ID! you insert a row for each pet, make a column called owner_id, and put in the id of the owner from the first table. in this table, you don’t want that column to be the primary key, because if someone has more than one pet, their id will appear in that column more than once. now your pets are all in the table and you can identify which pets go with which owner. that ties the two tables together, and this is the whole point of a relational database. you can write a join query to get owner names and pet names in one single query.
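a rough sketch of the two tables and the join query, again assuming SQLite through Python's sqlite3 module. the sample friends and pets are invented, and the pets table has no key of its own yet, just like in the story so far.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per friend, one row per pet; owner_id points back at friends.id.
conn.executescript("""
CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE pets (name TEXT NOT NULL, owner_id INTEGER NOT NULL);

INSERT INTO friends (id, name) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO pets (name, owner_id) VALUES ('rex', 1), ('tom', 1), ('polly', 2);
""")

# The join matches each pet to its owner through owner_id.
owners_and_pets = conn.execute("""
    SELECT friends.name, pets.name
    FROM friends
    JOIN pets ON pets.owner_id = friends.id
    ORDER BY friends.name, pets.name
""").fetchall()

for owner, pet in owners_and_pets:
    print(owner, "owns", pet)
```

notice that owner 1 shows up twice in owner_id, which is exactly why that column can't be a primary key.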

now one of your friends moves away, so you delete that friend from the friends table. later on, you decide to add a new column to the pets table called type. you’ll put things like bird, dog, cat, etc into that column. as you go through the pets row by row to fill it in, you look up each owner using the owner_id column and call them to ask. when you get to one pet, you can’t find its owner_id in the ID column of the friends table. what the heck? then you realize that’s the friend who moved away. you deleted them from the friends table, so that pet row is now an orphaned row (that’s the technical term). a foreign key is a type of constraint you put on the owner_id column in the pets table to tie it to the ID column in the parent table. by default it won’t let you delete a friend from the friends table without deleting the associated pet rows from the pets table first (you can also tell the database to delete those pet rows for you automatically). it also won’t let you insert a pet whose owner_id doesn’t exist in the friends table. that way you never end up with orphaned rows, which can break a lot of apps.
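a small sketch of the foreign key doing its job, assuming SQLite through Python's sqlite3 module. one SQLite-specific quirk: foreign key checks are off unless you run PRAGMA foreign_keys = ON on the connection; most other databases enforce them without that step. the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite only enforces foreign keys after this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# REFERENCES makes owner_id a foreign key pointing at friends.id.
conn.execute(
    "CREATE TABLE pets ("
    " name TEXT NOT NULL,"
    " owner_id INTEGER NOT NULL REFERENCES friends(id))"
)
conn.execute("INSERT INTO friends (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO pets (name, owner_id) VALUES ('rex', 1)")

# Deleting the friend while a pet still points at them is refused.
try:
    conn.execute("DELETE FROM friends WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Deleting the pet rows first makes the friend deletable.
conn.execute("DELETE FROM pets WHERE owner_id = 1")
conn.execute("DELETE FROM friends WHERE id = 1")
friends_left = conn.execute("SELECT COUNT(*) FROM friends").fetchone()[0]
print(delete_blocked, friends_left)
```

the first delete is blocked, and only after the pet row is gone does the friend row go away, so no orphan is ever left behind.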

those are 99% of what you’ll hear about/use. you’ll use surrogate keys all the time, but i rarely hear anybody actually use that word. ok, so back to your pets table. you have that owner_id column. let’s say you have two pets. that means there are two rows in the pets table with your id in the owner_id column. if you want to change one of your pets’ rows with an update statement, how do you do it? if you do “where owner_id=1”, it will update both rows. this is when you want a separate ID column that’s unique to each pet row. so usually people just add an “id int primary key auto_increment” column (that’s MySQL syntax). the numbers usually start at 1, and for each row you insert, the number in that column automatically goes up by 1. that way each row has its own unique id, and you can update one pet’s row directly without touching the other pets that have the same owner. that made-up id is a surrogate key: it means nothing outside the database. a natural key is the opposite, a column whose real-world value already identifies the row, like an email address. that’s all.

it’s common to have an id column in every table. you’ll often see that the table that contains the child rows (like our pets table) has a column named after the other table plus “_id”, and uses that as the foreign key column. so if you had a bosses table and an employees table, you’d have a column in the employees table called “boss_id” that would match the id column in the bosses table. in our example, if we did it the more common way, instead of owner_id it would be friend_id. or we’d rename the friends table to owners and keep our current owner_id column.
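a minimal sketch of the per-pet id column, assuming SQLite through Python's sqlite3 module. SQLite spells the column INTEGER PRIMARY KEY AUTOINCREMENT rather than the MySQL-style auto_increment, and the friends table is left out here to keep the example short. pet names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# id is a surrogate key: the database hands out the numbers itself.
conn.executescript("""
CREATE TABLE pets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    type TEXT,
    owner_id INTEGER NOT NULL
);

-- Two pets with the same owner; no id values are supplied.
INSERT INTO pets (name, owner_id) VALUES ('rex', 1), ('tom', 1);
""")

# Filtering on the pet's own id touches exactly one row,
# whereas "WHERE owner_id = 1" would have updated both.
conn.execute("UPDATE pets SET type = 'dog' WHERE id = 1")

pets = conn.execute("SELECT id, name, type FROM pets ORDER BY id").fetchall()
print(pets)
```

only the first pet gets a type; the second pet keeps an empty type even though both share owner_id 1.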

but anyway, i hope that makes things clear
