Archive for the ‘Technology’ category

Netflix Recommendations: Why a librarian is still needed

June 27th, 2011

I love recommendation algorithms. Algorithms are what programming is all about. You take information and swirl it around and get something new.

I appreciate algorithms daily. Any time Pandora takes my thumbs up of Howard Shore and Danny Elfman and gives me Harry Gregson-Williams, I have a new composer that I wouldn’t normally have discovered.

I really like Netflix’s algorithm for recommending movies, but a pretty funny/bad recommendation happened on two separate occasions, so I thought that it was worth looking into. I have reviewed 1,302 titles in Netflix as of this writing. (I’m a fan of Netflix.) Netflix then gives me 4,342 recommendations (today) that I sift through to find something to watch.

This strikes me as an example of where a human librarian would be a great asset. It’s all about the algorithm.

Amazon makes recommendations based on what previous users bought, mixed in with sponsored searches paid for by advertising firms. In one Amazon purchase, I bought a textbook for my master’s program, a Dragonball manga for my brother, and a travel guide to Nigeria for my sister-in-law. You should have seen Amazon strain its mechanical brain trying to figure out what in the world I was doing.

Pandora is called the music genome project because it breaks down each song into its most basic elements and recommends other songs based on similar characteristics. But just because one song has a good bass line doesn’t mean that I will like another, although it’s a good start.
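For the programmers in the room, here is a minimal sketch of what "similar characteristics" could look like in code. I don't know how Pandora actually scores songs, so everything here is an assumption: the song names, the three made-up traits (bass line, vocals, tempo), and the choice of cosine similarity as the way to compare them.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length trait lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a song with no traits at all is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical catalog: each song is scored 0..1 on [bass line, vocals, tempo].
songs = {
    "Song A": [0.9, 0.2, 0.7],
    "Song B": [0.8, 0.3, 0.6],
    "Song C": [0.1, 0.9, 0.2],
}

def most_similar(title, catalog):
    """Return the name of the other song whose traits are closest to title's."""
    target = catalog[title]
    scored = [
        (cosine_similarity(target, traits), name)
        for name, traits in catalog.items()
        if name != title
    ]
    return max(scored)[1]

print(most_similar("Song A", songs))
```

Song A and Song B both lean on the bass line, so B comes back as the match. Nothing in those numbers knows about my 8th grade promotion.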

Songs are much more than elements – there are nostalgic feelings that make me like a song, even if it is some cheesy Boyz II Men song I danced to at my own 8th grade promotion that Pandora has no clue about. I could do a search for more songs from that time period and hope to find more results, but it’s different describing that song to a person and connecting with them.

So how does Netflix’s algorithm work? Netflix ran a contest, the Netflix Prize, that wrapped up in 2009 for programmers to write a better recommendation algorithm. If you read some of the papers the teams published, you’ll see that Netflix gave the teams about 100 million ratings done by users. That’s a lot of data, and my applause goes to the teams that tackled roughly 480,000 users’ worth of information. Netflix gave teams a probe set, a quiz set, and a test set. The probe set is a sample of ratings with the answers included: on this day, this user said this movie was this many stars. The quiz set was a list of movies that the programmers had to predict the user ratings for. They would post those predictions and Netflix would say how close they got. The test set was the final exam for the algorithm, where programmers didn’t know how well they did until the winner was announced.
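"How close they got" was scored in the contest with root mean squared error (RMSE): take each prediction's miss, square it, average the squares, and take the square root. Here is a small sketch of that calculation. The four ratings and predictions are made up for illustration and are not contest data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up example: what a user really rated vs. what a program guessed.
actual_stars = [4, 3, 5, 1]
predicted_stars = [3.5, 3.0, 4.0, 2.0]

print(rmse(predicted_stars, actual_stars))  # 0.75
```

Lower is better, and because the misses are squared, one wild guess hurts more than several small ones.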

BellKor’s Pragmatic Chaos won the contest. Their program took into account ratings on specific days and chunks of weeks at a time. They looked at how a movie may share the same elements (like Pandora’s model) as another movie of the same genre, and match the genre requirements perfectly, but just be a bad movie. That’s a tough thing to write into a program. The programs I write always use brute force through conditional statements. If I wrote the Netflix algorithm, it would be:

if movie == "Troll 2" then RunAway();

I am in awe of BellKor’s team. (Some of them are part of AT&T’s research team, though, so they’re not just some random college kids who were bored one summer and decided to write ubercode.)

All that being said, how does this relate to librarians? Netflix has one of the most sophisticated algorithms for recommendations that I have ever seen. It’s a thing of coding beauty.

And yet…

  1. How come, when I rated a Brian Regan stand-up comedy routine highly, it instantly recommended a documentary about monks who live in isolation in the mountains?
  2. How come it said, “Based on your rating of Inception, we recommend Back at the Barnyard”? What about a Chris Nolan thought-crime movie makes you think that I want to see dancing cows?

Those recommendations were made months apart, so it’s not like Netflix just decided to be wacky this weekend.

Rating predictions are great, but there are still some flaws. Good librarians help you sift through that data, analyze what previous users have done, evaluate elements of genres, and make decisions filtering trends over time. That, and most will provide a listening ear when you ramble about Boyz II Men, even if they think you’re crazy.

Rendering Spanish Versions of Animated Movies

June 22nd, 2011

I love DVDs.

When they first came out, the big push in my neighborhood came from Hollywood Video, a video store franchise that has long since gone out of business. Their main argument was that DVDs offered so much more content for movie fans. When I bought my first DVD, the only bonus feature it had was the original theatrical trailer. Big let-down. Now, they have tons. Most Disney animated features include some sort of game. I’m sure people play them, but that’s not how we spend our free time.

One thing that I noticed today, though, was an awesome feature for those who enjoy other languages. If you’re like me and have a preschooler at home, I’m willing to bet that Tangled is in your rotation of Movies the Kids Can Watch that Won’t Drive Me Bonkers but May Very Well If We Watch It One More Time.

Having watched it a bajillion times, I was looking for some variety. Before the main menu, the DVD gives you three options: English, Descriptive English, and Spanish. Descriptive English is like VoiceOver on the Mac. It narrates everything that’s going on, which is especially fun during Tinker Bell’s flyover of the Disney logo. “There’s a burst of light and then pixie dust.” I visited with my friend at Accessibility Insights one day and that’s his entire computer experience. Every little detail on the Internet is read off at super-human speed. “Page load at 40%. Page load at 45%.” Crazy-making.

It was one subtle feature of the Spanish version that made me pause today. A character is holding up a wanted poster and I realized that every piece of text in the Spanish version is in Spanish. I know that sounds obvious, but it wasn’t always the case.

This is huge for me, someone who bought Spanish VHS tapes all through college. VHS never gave me the option to watch it in English if I felt like it. I’d have to buy another copy, which I was too cheap to do.

Rendering text in another language means that Disney re-did a scene. That costs time and money, albeit not as much time since the words are just a texture map applied to a 3D model. But Disney could have gone the easy route and put the Spanish translation of the wanted poster in subtitles like many movie companies have done. I know that Disney hired big-time voices from Latin America for the Spanish audio track on the DVD. The soundtrack is great. They re-wrote all of the songs so that the lyrics rhymed/flowed well. I’m just impressed that, if I didn’t know the original movie was in English, I could watch the entire thing in Spanish seamlessly.

Me on the Web

June 16th, 2011

Admit it. You’ve Googled yourself.

If I had said that to my grandparents when they were alive, they would have called me crazy. To them, the word would have sounded like googol, a ‘1’ followed by a hundred zeroes. (Google was also a book in 1913 about crazy creatures.)

But now people Google themselves to see what others are saying about them on the Internet. When I search ‘brian griggs’, this site is the first result, which is cool. The first YouTube video, though, is about a double homicide. Yeah, that’s not me. Hopefully people don’t assume it’s connected to me.

And that’s what Me on the Web, released today by Google, is all about. It’s a new feature for your Google dashboard that tracks all of your different online presences: Twitter, Facebook, LinkedIn, and whatever other social networks you belong to (a cupcake Ning, perhaps?). You can now see when people are looking at your online posts and even get notified when people mention you. Check out more here.

Now, Google doesn’t run the Internet, contrary to some students’ beliefs. When I ask them to cite their sources, they put down “Google”. Google is the search engine. It points you towards information. (Complicating this is all the stuff that Google does own, like Google Pages and YouTube.) If you find bad stuff about you online, Google can’t just go into someone’s site and remove the content. There is a URL removal request form that you can use to have Google take results out of a Google search, but if other sites have grabbed the unwanted stuff, it’s already too late.

At the start of the year, Google was ordered by the Spanish Data Protection Agency (AEPD) to take down outdated information from search results. The AEPD is calling it the Right to Forget. A man had been charged with a crime and was then acquitted, yet only the articles about him being accused of the crime came up in results. Google argued that their product only points to stuff and that it’s the responsibility of the content publishers to take down the unwanted data.

I write book reviews. Usually I stay pretty polite, since I understand that book enjoyment has a subjective element to it (and the whole “if you don’t have something nice to say” thing my mom always lectured me on). Also, books that I pick up to read will generally be worth a review. I’ve been in the business long enough to spot a completely blah book before I get too far in the book.

Let’s say that I have a huge criticism of a book in a review.

J. K. Rowling is going to make a huge announcement here. Let’s say it’s a book I don’t like and I wrote a review about how there are too many owls in the story and the whole thing is old hat. Rowling gets mad and asks Google to take away my search results. No matter what, I’m the one that put the information on the Internet and since I didn’t break any laws, it stays online. It just gets a little tougher to find if someone Googles it.

Me on the Web is a good tool to keep yourself informed about what other people are saying about you, if only to automate Googling yourself, but, like always, the key concept is to be careful because whatever goes online usually stays online in one form or another.

Veezzle for Free Stock Photos

June 14th, 2011

I know that every time we make a flyer at school, we don’t use student photos and we don’t use copyrighted images. (The stock photo watermark does not add to professionalism. Trust me.)

So what if you’re tired of the Microsoft fist pump guy?

Try searching Veezzle.

Clean your projector’s filters.

April 28th, 2011

If you work in a dusty area, clean your LCD projector’s filters before the recommended time to avoid bulbs burning out early. The filters are usually located on the sides and look like a grill. The plastic piece pops off easily to reveal the foam that catches the dust. If it gets too dusty, the machine will overheat.

Just to be safe, go clean yours now.

The Cuckoo’s Egg by Cliff Stoll

April 26th, 2011

Cuckoos lay their eggs in other birds’ nests and expect someone else to raise their young. In The Cuckoo’s Egg by Cliff Stoll, astronomer-turned-sysadmin Stoll discovers 75 cents’ worth of computer time in the accounting records that is unaccounted for in the user logs. Someone logged in, but there’s an error. Like any good scientist, Stoll picks apart the computer code and sees that it’s working just fine. As he digs deeper, he realizes that a hacker has been in the computers at Lawrence Berkeley Laboratory.

Stoll’s conflict between freedom and order, between his college radical roots and his admin duties, is what creates the character development and makes Stoll a relatable narrator. This is a true story of a computer crime case that happened in 1986. Stoll published the book in 1989 and included many details from his logbooks in the story.

I can remember being online for the first time in 1994. I had a teacher who ran a bulletin board service, and I dialed in with my 2400 baud modem to connect directly to his computer. The Cuckoo’s Egg is great for tech nostalgia. Usually I want a tech book that is extremely current, but sometimes it is important to see our roots. The epilogue is my favorite part, as Stoll recalls a new threat: a worm embedding itself and spreading across the Arpanet, the Internet’s grandpa.

This, kids, is a floppy disk. This particular one holds tech-deadly source code.

Even if you’re not a total computer fanatic, there are parts to enjoy about the book. I do feel, though, that having a decent knowledge of computing greatly enhances the suspense when you’re able to appreciate the nontraditional techniques the astronomer uses to capture a hacker who has ties to a high-powered government agency.

Facebook Safety Center

April 19th, 2011

Facebook is adding more features to its Safety Center to allow students to report bullying not only to Facebook employees but also to someone in their network. Those people in their network will be more capable of addressing issues in the face-to-face world.

There will also be a teacher’s guide coming out written by Linda Fogg Phillips, B.J. Fogg, and Derek Baird to help educators stay current on social media. The best way to stay updated is to use it yourself, but hopefully this will be a bridge for adults who lack technological confidence.

Many times the bullying online is a sign of trouble offline, so, like always, the best practice for teachers is to know your students and watch for signs of bullying.

App Inventor from Google is like Scratch

April 9th, 2011

App Inventor, an in-browser tool for writing your own Android apps, looks very similar to programming in Scratch.

Very intriguing…

bit.ly – Did you ever wonder where the .ly came from?

April 8th, 2011

I use tinyurl.com to shorten most of my URLs, but I will sometimes use clients that have bit.ly shorten the web addresses. I just learned something new today: the .ly domain names originate in Libya.

Here’s what the Wall Street Journal has to say:

The .ly domain is controlled by Libya’s General Post and Telecommunications Co., whose chairman, Mohammed el-Gadhafi, is the dictator’s eldest son. It says it has rented out more than 10,000 .ly domains, either directly or through resellers.

I knew that .tv belonged to Tuvalu, but I don’t think the U.S. will ever have a military presence on that island nation.

You Make Me Sick!

April 4th, 2011

The winners of the STEM video game challenge have been announced.

The professional developer prize went to Filament Games for their web-based You Make Me Sick! It’s made in Flash and is a good example of a simple game done right.

Here are the student winners.