Create Your Own 3-D Physical Search Engine
Here you will investigate search engines by exploring
their inner parts. You will build a physical search engine and visually understand how it works on
the tiny 3-D Food! document collection.
3-D Physical Search Engine
You need the following tools for this project:
- wooden stick - 10 inches in length. Make tick marks every inch.
- 10 pipe cleaners: 2 red, 1 blue, and 7 green - at least 10 inches in
- piece of paper with Quadrant 1 of the xy-plane marked on it. You can make this.
Take a 10 inch by 10 inch piece of paper. Starting from the bottom
left-hand corner, move up along the left edge of the paper to create the
y-axis, making a tick mark every inch. Do the same along the bottom edge,
creating the x-axis.
- 2 styrofoam blocks - about 2 inches by 2 inches.
- Orient yourself with the piece of paper that has the
xy-plane marked on it. Practice by plotting the vector
Note the scale of the axes: one tick mark represents 0.1 units. The first element of the vector represents the distance on the x-axis, and the
second element represents the distance on the y-axis. Now measure in
inches the length of the line connecting the origin and the point (0.5, 0.8).
Remember the scale. Note
that the length of this 2-D vector is equal to the square root of the sum of
squares of each element in the vector. That is, for the 2-D vector
the length is
- Now use the 2-D xy-plane and create a 3-D xyz-space. Place the edge
of the xy-plane over the edge of the desk.
Adhere the styrofoam block to the corner of the desk with tape. Make sure
the top of the block sits about 1 inch above the face of the desk.
Now tape the wooden stick to the styrofoam block so that the
stick's base touches the origin and the stick stands vertical with the
numbers facing the xy-plane. This will be the z-axis. See the Figure 1.
Figure 1. Setup of desk
using the 3-D space by plotting the 3-D vectors and
The first element represents distance on the x-axis, the second
represents distance on the y-axis, and the third represents
distance on the z-axis.
Notice that the length of each 3-D vectors is, as with the 2-D vectors, the
square root of the sum of squares of the elements in the vector. That is,
for the 3-D vector
the length is
Find the lengths of your practice vectors.
- Now return to the small Food! magazine document collection
example. The term-by-document matrix A for this example is
Recall that each column of this matrix represents a document vector
corresponding to a particular article. Find the length of each document
vector using the formula from step 2. Now cut a green pipe cleaner
1 inch longer than the
length of each document vector and, using the stickers, label each pipe cleaner
according to the document vector it represents.
- Plot each document vector on your 3-D space system by inserting one end
of the pipe cleaner 1 inch into the styrofoam block at the origin at the proper
angle so that the other end of the pipe cleaner touches its 3-D point.
- Once all the document vectors have been plotted, consider the
query vector. Query vector 1 from this Food! example is
Find the length of this query vector and cut a red pipe cleaner 1 inch longer
length. Plot this query vector in your 3-D space.
- Visually inspect your 3-D graph and see if you can answer the question
"Which document vectors (green pipe cleaners) are closest to this
query vector (red pipe cleaner)?"
Use your protractor to measure an angle of 50°.
Fold the blue pipe cleaner in half, opening the legs so that the angle between
them is 50°. Take the "V" end of the folded blue pipe cleaner and
hold it at the origin. Take one leg of this "V" and fasten it around the
red query pipe cleaner. Make sure the legs of the blue pipe cleaner stay
Rotate the free, unattached leg of the blue pipe cleaner around the red
pipe cleaner, sweeping out an invisible cone.
Figure 2 shows a visible representation of the invisible
cone. Only the red and blue pipe cleaners, not the green ones, are depicted,
to simplify the picture.
Figure 2: Cone created by sweeping blue pipe cleaner around red pipe cleaner
vectors fall in this cone? Notice that an angle of 50° has
a cosine measure of 0.643. This means that all document
vectors whose cosines of their angles with the query vector are greater than
0.643 will be retrieved
by your physical 3-D search engine system. Suppose you want to be more
lenient and return to the user all documents with a relevancy score
above 0.3. What angle should you now sweep out? [You desire the angle θ such that cos(θ) = 0.3, which means
θ = cos-1(0.3) = arccos(0.3) = 72.54°.] Now which document vectors would your
physical 3-D search engine return to the user?
- Repeat these steps using query 2 from the Food! example, Since the document vectors remain the same, you can start at step 5. Be sure
to remove the red pipe cleaner for query vector 1 and cut the remaining red
pipe cleaner the appropriate length for query vector 2. For
step 7, use stricter values for the cutoff value. Try a cutoff value of 0.8,
which implies θ = cos-1(0.8) = arccos(0.8) = 36.87°, and a more
lenient cutoff value of 0.7, which implies
θ = cos-1(0.7) = arccos(0.7) = 47.57°.
Take a few minutes now to think about the limitations of this physical
search engine system. For example, the human mind cannot visualize 4 and 5
dimensions easily, much less 30,000 dimensions. Also, how would the weighting
scheme used in the term-by-document matrix affect the results? In this small
Food! example, we used the percentage weighting scheme for vectors. That
is, the ith element of a document vector represents the percentage of that document
pertaining to term i. Suppose frequency weights had been used. Then the number of
times the document mentions term i is stored in vector element i.