Best layout for a particular MySQL data field

While I’m not shoveling snow (over 109" so far this year) I’ve been teaching myself PHP & MySQL (I have a yutz for a teacher). While laying out a MySQL tables (Shoes), I’m wondering what is the best way to handle the field I’m calling ‘Size?’ Size in this case incorporates both length and width. Not to get too detailed, but sizes can vary by manufacturer. For example a men’s 10 D (length, width) could also be called a 10 M (medium) in US measurements, or a 42 in some European measurements.

The inventory of vintage shoes for this site has some styles with only one pair (relatively simple problem to solve). Others have a dozen or so pair in different sizes. Some have two or more pair of the same size. My concern is how best to handle these later examples in the database schema.

I could simply create a Size field as a type VARCHAR, then list each size (e.g. 7AA, 9B, 9.5C). This would be fine for styles with just a few sizes. For those with a dozen sizes, it might become cumbersome to display on screen. From a user perspective, if they do a search based on sizes available (rather than say color or style), then the screen presentation wouldn’t be much of a problem. Looking down the road, I’d also like to automate the deletion of the size from available inventory when the sale is made.

Any thoughts or suggestions would be greatly appreciated.


dynamo mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options

On Jan 13, 2009, at 7:12 PM, James Brouwer wrote:

I could simply create a Size field as a type VARCHAR, then list each
size (e.g. 7AA, 9B, 9.5C). This would be fine for styles with just a
few sizes. For those with a dozen sizes, it might become cumbersome
to display on screen. From a user perspective, if they do a search
based on sizes available (rather than say color or style), then the
screen presentation wouldn’t be much of a problem. Looking down the
road, I’d also like to automate the deletion of the size from
available inventory when the sale is made.

I would model this with two separate properties: size and width. Size
would be a float, so you could do half-sizes. Width would be a
varchar, and would allow null for shoes that don’t specify their width.

This would allow you to set multiple picking lists in your search
interface, and only show styles that have matching sizes. This is one
of my favorite features at Zappo’s, which is pretty much the only
place I get shoes these days.

As to the inventory problem, I would add an attribute called units or
stock, which would be an int(5) or whatever makes the most sense for
your inventory numbers. Each time you sell one of these, you would
decrement the stock as a part of the sale process and update the row.
In the search and index queries, you would add ’ AND stock > 0 ’ in
the WHERE side of the SQL. That way you’ll never show something you
can’t sell.

To make a picker for sizes and widths, try this trick:

‘SELECT DISTINCT(size) AS s FROM Shoes WHERE stock > 0 ORDER BY s ASC’

Now, iterate over the result set to build your picker:

$p = '<select size="1" name="size">';
$p .= '<option label="Any size" value="">Any size</option>';
while($obj = mysql_fetch_object($result)){
	$s = $obj->s;
	$p .= '<option label="' . $s . '" value="' . $s . '">' . $s . '</ 
option>';
}
$p .= '</select>';

Do the same for width, and your search interface is built for you –
and you will always only show sizes that have shoes to go with.

Walter


dynamo mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options

Walter,

Great approach, thank you. It makes perfect sense, though it raises another table/field layout question.

We’re planning on having roughly 4 photos per shoe style (here a manufacturers style would be distinctive for each color and material, as well as their other features). For styles in which we happen to have multiple pairs, we would only take representative photos of one pair.

I’m trying to minimize the workload of entering redundant data into the Shoes database as well as avoiding unnecessary null fields. Clearly, entering the same photos multiple times for different sizes in the same table doesn’t make sense. Would the proper way to go be to set up an Inventory table with a primary key such as SKU# to hold each pair of shoes by size (with length and width fields) using a foreign key such as Mfg#? The related Descriptions table would hold Mfg# as the primary key and have the corresponding fields that would describe a particular style (including all of the photos). Does that make sense?

Taking it a step further, could we reduce overall storage size (and redundant data entry) by developing a number of tables to lay out all of the other redundant information such as Manufacturer Name, Color, Material, Style, Feature, etc.? Or would such a schema of pulling in multiple tables wind up just slowing server response time down (trading off storage space reduction for performance hits)?


dynamo mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options

On Jan 14, 2009, at 1:39 PM, James Brouwer wrote:

Walter,

Great approach, thank you. It makes perfect sense, though it raises
another table/field layout question.

We’re planning on having roughly 4 photos per shoe style (here a
manufacturers style would be distinctive for each color and
material, as well as their other features). For styles in which we
happen to have multiple pairs, we would only take representative
photos of one pair.

I’m trying to minimize the workload of entering redundant data into
the Shoes database as well as avoiding unnecessary null fields.
Clearly, entering the same photos multiple times for different sizes
in the same table doesn’t make sense. Would the proper way to go be
to set up an Inventory table with a primary key such as
SKU# to hold each pair of shoes by size (with length and
width fields) using a foreign key such as Mfg#? The related
Descriptions table would hold Mfg# as the primary key
and have the corresponding fields that would describe a particular
style (including all of the photos). Does that make sense?

Add a table called photos. Give each photo an id, and in each shoe
that has a photo, have a matching photos_id.

As far as the interface for adding product goes, I would make it
possible to add multiple sizes in one pass. Make an interface with all
the fields for the shoe – name, description, materials, whatever.
Then add one field called size (square brackets are important here)
and another called width and another called stock. Use a
JavaScript to add more copies of these fields as needed.

When your form is submitted, all of those sizes and widths are going
to appear as arrays, so you can use that as a way to generate shoes in
the database without having to enter each size over and over. (The
following example uses the MyActiveRecord object-relational mapper for
brevity, you can adjust this to match your own method of creating and
saving rows in the database.)

foreach($_POST['size'] as $k=>$size){
	$shoe = MyActiveRecord::Create('Shoes');
	$shoe->populate($_POST);
	$shoe->size = $size;
	$shoe->width = $_POST['width'][$k];
	$shoe->stock = $_POST['stock'][$k];
	$shoe->save();
}

So now, with one post, you have added as many different sizes as you
want, each as a separate row, and each with its own stock total. You
could have added 20 x 9.5 D, 15 x 9 E, whatever. It will be one post.
All of these shoes will have exactly the same properties set for name
and description and whatever else. If your form included an upload for
the photo, you could create the photo, get its id in a variable called
$photo_id, and then set that as the photos_id in each shoe record,
just by adding $shoe->photos_id = $photo_id; before the call to
save();

Taking it a step further, could we reduce overall storage size (and
redundant data entry) by developing a number of tables to lay out
all of the other redundant information such as Manufacturer Name,
Color, Material, Style, Feature, etc.? Or would such a schema of
pulling in multiple tables wind up just slowing server response time
down (trading off storage space reduction for performance hits)?

In my experience, there’s a trade-off to be had between data model
complexity and storage size. Unless you are pasting the Magna Carta in
each description field or you have two dozen sizes of each show, the
repetition is not going to be an appreciable drag on your system.
Having to calculate a JOIN just to show a record is going to be a
measurable hit on performance every time someone visits a page. It’s
something you will have to test, but my sense is that if you design a
usable admin interface (effectively hiding the repetition from
yourself) you can benefit from the database server not having to do
something like this on every single hit:

SELECT shoes.* photos.* sizes.* whatever.* FROM shoes, photos, sizes,
whatever WHERE 1 AND shoes.sku = photos.sku AND shoes.sku = sizes.sku
AND shoes.sku = whatever.sku ORDER BY sizes.size ASC, sizes.width
ASC …

when you could just

SELECT shoes.* from shoes WHERE 1 ORDER BY size ASC, width ASC

muuuuuch faster.

Walter


dynamo mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options