Posts Tagged ‘Algorithm’
The selection and placement of stories on this page were determined automatically by a computer program. The time or date displayed reflects when an article was added to or updated in Google News. From the Google News About Page
Google is not ours. Which feels confusing, because we are its unpaid content-providers, in one way or another. We generate product for Google, our every search a minuscule contribution. Google is made of us, a sort of coral reef of human minds and their products.
August 31, 2010 “Google’s Earth” New York Times Op Ed By WILLIAM GIBSON
If you need any proof there’s a cold war raging between collaborative filtering algorithms and human arts reviewers, and that the algorithms are winning, pick up a copy of the Spectrum liftout in Saturday’s edition of The Sydney Morning Herald. Turn to the music reviews, this week on pages 22 and 23. All of the short music reviews end with LIKE THIS? TRY THESE recommendations that I found just plain creepy when I noticed them a few months ago. A human reviewer aping Amazon.com’s CUSTOMERS WHO BOUGHT THIS ALSO BOUGHT THESE recommendations? The featured review has a pie chart of relative values that seems like a data set an algorithm might consult to generate the BOUGHT THAT? BUY THIS recommendations.
Collaborative filtering algorithms suggest possible paths users might take through gargantuan databases. Another term for them is “social information filtering”. The algorithms themselves are roughly equivalent to Coca-Cola’s recipe and are jealously guarded. The general idea, though, is that your data is folded into the larger set of data provided by all users of the system, to suggest items you might be likely to buy, on the assumption that humans are predictable, which, apparently, we’ve mostly proved to be. It used to be just YOUR data, but now Amazon.com and Apple are pulling in social networking services. Amazon.com is running a beta test with Facebook so that you can be guided by the recommendations of your friends. Apple has introduced a social network, Ping, into iTunes 10 that’s mostly connected to the outside world through Twitter (tech blogs have unconfirmed reports of a deal between Apple and Facebook falling apart).
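Since the real recipes are as guarded as Coca-Cola’s, here is only a guess at the general shape of the idea: a minimal user-based collaborative filter in Python. The users, albums, and ratings are invented, and nothing here reflects Amazon’s or Apple’s actual systems.

```python
from math import sqrt

# Invented data: user -> {item: rating}
ratings = {
    "alice": {"Grinderman 2": 5, "Royal Toast": 4, "This Is Happening": 2},
    "bob":   {"Grinderman 2": 5, "Royal Toast": 5, "Burnt Weeny Sandwich": 4},
    "carol": {"This Is Happening": 5, "Burnt Weeny Sandwich": 1},
}

def similarity(a, b):
    """Cosine similarity over the items two users have both rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    na = sqrt(sum(ratings[a][i] ** 2 for i in common))
    nb = sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (na * nb)

def recommend(user):
    """Score items the user hasn't rated, weighted by how similar
    each other user's taste is: your data folded into everyone's."""
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other)
        for item, r in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * r
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # → ['Burnt Weeny Sandwich']
```

The “folding in” is the weighted sum in `recommend`: every other user’s opinion counts, but in proportion to how much their past ratings resemble yours.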
Google’s AdSense is the algorithmic system closest to what a newspaper is: a collection of information surrounded by advertisements for goods and services. The Sydney Morning Herald’s algorithmic approximations seemed to me to imply that any dividing line between advertising and editorial had irrevocably broken down, and that, like Amazon.com, they’d serve up marketing suggestions within the general body of information. I recognised that it wasn’t the fault of the critics, that they were operating within a formula probably forced upon them by the business side of the newspaper rather than the editors, but yesterday was the first time I had a sense of the cost to the reviewers.
What an algorithm ‘wants’ is for you to feel so comfortable with the information it provides that you’ll make a purchase. What a reviewer should do is surprise, unsettle, or even just plant a seed. On Saturday jazz critic John Shand reviewed Royal Toast by The Claudia Quintet, a band he aligned with the sensibilities of Frank Zappa. His review was informed by having also seen them perform when they toured Australia in May. Jazz is my favourite music, and the place where rock and roll overlaps with jazz particularly fascinates me: for example, Gil Evans’s arrangement of the music of Jimi Hendrix, the Laughing Clowns, The Bad Plus’s versions of songs by Blondie and Nirvana, and Cassandra Wilson’s interpretation of Miles Davis’s interpretation of Cyndi Lauper’s “Time After Time”.
I read The Sydney Morning Herald over coffee. I called up the previews of The Claudia Quintet’s album on iTunes and wanted to hear more. A quick Google search told me that they’d been at the Melbourne Jazz Festival, another plus, I admire its broad and adventurous perspective. So a critic had introduced me to something I might not have found on my own. And the LIKE THIS? TRY THESE recommendations turned out not to be marketing suggestions, or at least not very effective ones – Burnt Weeny Sandwich by Frank Zappa and The Mothers of Invention and Bravo Nina Rota by The Umbrellas aren’t available through iTunes and The Umbrellas CD is hard to track down.
And Bernard Zuel’s review of Grinderman 2? It’s not what I hear, but that’s not what’s at issue here. It’s that the pie chart detail – 28% the Devil, 28% the Clown, 28% Birthday Party, 16% Hubert Selby Jr – doesn’t allow the critic to explain his references. Whatever I divine of the Birthday Party in Grinderman is on a distorted feedback loop, fed into the inspiration Warren Ellis drew from them when he was forming the Dirty Three, and back through Nick Cave’s admiration for the Dirty Three. And I wouldn’t say clown but ‘clowns’: Grinderman’s sound has an electrifying texture that aligns with new Bad Seed Ed Kuepper’s revived Laughing Clowns. But we’ll likely never know what Bernard Zuel means by his references.
ALGORITHMS SUGGEST CONTENT ENHANCEMENT FOR BLOG POSTS
A few weeks ago the blogging platform WordPress announced a new feature from a company called Zemanta that would “enhance” posts with “relevant images, videos, and links”.
“We analyse your post through our proprietary natural language processing and semantic algorithms, and statistically compare its contextual framework to our preindexed database of content.”
The language used by companies seeking to elevate the perspicacity of their algorithms, and to inspire us to trust their insights while not giving anything away about how they work, is worthy of examination by Don Watson, who has so entertainingly dissected the language corporations misuse.
I don’t use Zemanta for my posts. I prefer to forage for my own “contextual framework” but here’s my concern: how do we fact check the preindexed database?
Venture Capitalist Fred Wilson, whose company Union Square Ventures invests in Zemanta, wrote in a blog post:
“If you think about it, Zemanta is ‘adwords for content creators’ … The obvious things would be monetization services (affiliate links, text ads, and even graphical ads), widgets and badges, video, quotes and music.”
I read Fred Wilson’s blog and find him thoughtful and engaged. He has a good sense of the eventual usefulness of services that are inscrutable at first, and he incubates them by allowing them to find their users naturally and establish themselves by being useful. He uses the services he invests in, and I deeply admire two of them: the hyperlocal news aggregator Outside.in and Twitter. At the time he invested in them, people couldn’t figure out how or why they’d be useful. But this is what you might get if you take an AdWords approach to content. This is from Google’s description of its AdSense program:
“AdSense for content automatically trawls the content of your pages and delivers ads (you can choose either text or image ads) that are relevant to your audience and your site content, ads so well matched, in fact, that your readers will actually find them useful … You earn money whenever your visitors click on them.”
Here’s a small test, unscientific and conducted at random, to suggest that algorithms have difficulty assessing irony and establishing context.
I rely on The New York Times movie reviews. I almost always disagree with their perspective on the movies that become important to me – Blade Runner, Dr Strangelove, Moon – so I now look for their negative reviews to see if there’s something in there I’d like. This is the first paragraph of the 1964 review for Dr Strangelove:
“Stanley Kubrick’s new film, called Dr Strangelove or: How I Learned to Stop Worrying and Love the Bomb, is beyond any question the most shattering sick joke I’ve come across. And I say that with full recollection of some of the grim ones I’ve heard from Mort Sahl, some of the cartoons I’ve seen by Charles Addams, and some of the stuff I’ve read in MAD Magazine.”
The AdSense ad running next to the Dr Strangelove review:
“Guitar Heaven: The Greatest Guitar Classics of All Time” by Santana.
The AdSense ads running next to the review of Blade Runner:
Acting classes for would-be movie actors in Sydney, the movie review section of the Brisbane Times, and an Australian production of Noël Coward’s Private Lives.
An AdSense ad running next to the story about Prime Minister Julia Gillard naming former Prime Minister Kevin Rudd as Foreign Minister:
International Flight Sale, tickets to London from $1899.
Algorithmic perversity. God I love that term. I can imagine this era becoming known as ‘The Age of Algorithmic Perversity’ in the way that the 1920s is ‘The Jazz Age’. It was coined by Alexis Madrigal and jumped out at me from a book review he wrote a couple of weeks ago.
I’ve been thinking about algorithms since 1998, when Ken Goldberg, a pioneer of telerobotic art installations on the internet, a Professor in the Department of Industrial Engineering and Operations Research at Berkeley, and now Director of the Berkeley Center for New Media, put his Jester project online.
It was a joke recommendation system, underpinned by an algorithm he’d written (called ‘Eigentaste’) that was set up to do something nearly impossible, given how subjective humour is: to figure out what jokes you might find funny. The algorithm seemed to be trying to establish a context for humour beyond rigid categories by asking for answers as a relative value, positioned somewhere between not funny and hilarious.
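The Eigentaste paper describes a cleverer pipeline than I can do justice to here (continuous-scale ratings for a fixed “gauge set” of jokes, reduced by principal component analysis, then clustered). The sketch below keeps only the flavour of the idea, the continuous slider and a nearest-neighbour step, with invented users, jokes, and ratings.

```python
GAUGE = ["joke_a", "joke_b", "joke_c"]  # the jokes everyone rates first

# Ratings on a continuous scale: -10 (not funny) to +10 (hilarious).
users = {
    "u1": {"joke_a": 8.5, "joke_b": -3.0, "joke_c": 6.0, "joke_d": 9.0},
    "u2": {"joke_a": -7.0, "joke_b": 9.5, "joke_c": -2.0, "joke_e": 8.0},
}

def nearest_user(gauge_ratings):
    """Find the existing user whose gauge-set ratings sit closest to
    the newcomer's, by squared Euclidean distance on the sliders."""
    def dist(u):
        return sum((users[u][j] - gauge_ratings[j]) ** 2 for j in GAUGE)
    return min(users, key=dist)

def recommend(gauge_ratings):
    """Recommend the neighbour's best-rated joke outside the gauge set."""
    neighbour = nearest_user(gauge_ratings)
    extras = {j: r for j, r in users[neighbour].items() if j not in GAUGE}
    return max(extras, key=extras.get)

# A newcomer who liked joke_a and joke_c but not joke_b lands near u1,
# so u1's favourite unseen joke is served up.
print(recommend({"joke_a": 7.0, "joke_b": -5.0, "joke_c": 5.0}))
```

The point of the continuous scale is visible even in this toy: taste is a position in a space, not a tick in a category box.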
Ken’s latest algorithm-based project is Opinion Space, being used by the American State Department to gain a sense of how people around the world respond to a range of issues. There are no ‘yes’ or ‘no’ answers, just a slider to indicate how strongly you agree or disagree with a range of questions about, perhaps, nuclear power, women’s rights as a peace issue, climate change, poverty. There’s a question box: What would you suggest to Secretary Clinton? You have the opportunity to respond to other people’s questions, and the algorithm rewards you with a high rating if you find insight in perspectives you would normally feel far removed from. Ken envisaged Opinion Space being a tool newspapers could use on online Op Ed pages.
Algorithms are not usually crafted to present you with anomalies, contradictions, or serendipity: that sense of adventure that comes from feeling you can’t abide classical music, until something Alex Ross writes about a recording by the Los Angeles Philharmonic conducted by Esa-Pekka Salonen makes you want to hear it, and so you buy it, and a new world opens up to you. In the Eigentaste paper Ken writes that another term for ‘collaborative filtering’, which is what happens when the suggestions of all other users are pooled to make suggestions for you, might be “social information filtering”.
Australia has been in a state of suspended animation for the last three weeks. The Federal election didn’t produce a clear winner. The electoral system is based on preferential voting: you rank the candidates, and if your first choice is eliminated, your whole vote transfers to your next preference, and so on, until one candidate holds a majority. It took three weeks to tally absentee and postal votes, and then either the current Prime Minister, Julia Gillard (of the Labor Party), or the leader of the conservative Liberal Party, Tony Abbott, needed to form alliances with a handful of independent candidates to reach the 76 seats required to form a government.
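That preference count is itself an algorithm, and a simple one to sketch: eliminate the last-placed candidate each round and let those ballots transfer, whole, to their next surviving preference. The ballots below are invented; real Australian counts involve many more rules than this toy shows.

```python
from collections import Counter

def instant_runoff(ballots):
    """Preferential count: each ballot ranks candidates in order.
    Eliminate the last-placed candidate every round until someone
    holds a majority of the ballots still in play.
    Ties for last place are broken arbitrarily here."""
    eliminated = set()
    while True:
        tally = Counter()
        for ballot in ballots:
            # Each ballot counts, whole, for its highest surviving choice.
            for choice in ballot:
                if choice not in eliminated:
                    tally[choice] += 1
                    break
        total = sum(tally.values())
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > total:
            return leader
        eliminated.add(min(tally, key=tally.get))

# Invented ballots: liberal leads on first preferences (3 of 7),
# but labor's eliminated ballots flow to green, who then wins.
ballots = [
    ["labor", "green"], ["labor", "green"],
    ["liberal", "labor"], ["liberal", "labor"], ["liberal"],
    ["green", "labor"], ["green", "labor"],
]
print(instant_runoff(ballots))  # → green
```

The example is the whole drama of those three weeks in miniature: the first-preference leader is not necessarily the winner.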
It was a photo finish, with the decisive vote being handed to Julia Gillard by one of the independents. I’m working on writing projects that have me keeping Los Angeles and London time at the moment, and I missed the drama of the countdown yesterday afternoon. My twitterstream told me that Julia Gillard had held onto power, but I didn’t know which independent had cast the deciding vote.
When I checked Google News at around 8 this morning, Sydney time, the first story was from the Australian Broadcasting Corporation, suggesting that the Gillard government was already in trouble over a mining tax the previous Prime Minister (Kevin Rudd, from her own party) had suggested a few months ago. The second story was from the video content provider ITN, and further down the page another story, from ABC Online, quoted “Queensland billionaire mining boss Clive Palmer” saying “the majority of Australians want another federal election.” I wanted to know: how many people make up “a majority”? How did he sense this?
Twelve hours later the top three results for “Australian Election” are from ITN, a content provider called Suite101 (which apparently pays writers $1.50 per 1,000 page views), and Reuters.
I was at the Kings Cross Library at noon and checked my Twitter timeline, and the news I wanted to read about the outcome of the federal election was posted by several writers connected with a publication called The Drum that’s part of the Australian Broadcasting Corporation.
There were also some tweets from Alexis Madrigal, commenting on algorithms.
One thing I noticed today. Algorithms are going wild. They are making choices for you left and right. We need to think that one through.
Because algorithms work in very particular ways … I’m not so sure their decision making will lead to socially optimal outcomes.
I asked him which algorithms, what kind of wildness, and he replied:
Take Facebook or Twitter’s “Who You may Know”/ “Who You Should Follow” algorithms
Harder to see effects of search algorithms shaping data most likely to be used.
Since breakfast I’d been making notes on how every part of my day is directed and shaped by algorithms:
I’ve wearied of the Sydney newspapers The Sydney Morning Herald and The Australian. The journalists I admire have migrated either to independent online services or publish their own blogs and I follow them on Twitter.
The newspaper feeds I follow through Twitter are from The Independent and The Guardian in London. But I reflexively log on to Google News Australia to see if there’s anything happening in the world and the country and Sydney, generally, that I should know about. That’s how my day starts, over coffee, around 6.30 a.m.
And I don’t trust what I read. I Googled “Google News Algorithm” and the top result was a blog post from Computerworld in 2009, quoting the creator of Google News, Krishna Bharat, saying that “articles are ranked based on originality, freshness, quality, expertise of source and whether a lot of other sources around the Web are pointing to a particular article.” An “About Google News” link from Google itself returned a “404 Not Found”.
This is at the heart of my distrust of marketing algorithms. “Originality”, “Freshness”, “Quality” and “Expertise of Source” are judgments human editors make. Algorithms, like Ken Goldberg’s, can begin by asking a set of questions that try to measure how individuals may define those qualities for themselves but Google doesn’t take that measure of its readers and I’d bet that the mathematical terms underpinning Google’s idea of “originality”, “freshness” etc. don’t match mine, or any editor’s.
I imagine the internet as being a world in the grip of a cold war where the motley spies and black marketeers are running spam empires, plotting world domination. I imagine that most of those “other sources” pointing to stories Google News links to are scam sites, set up with content scraped from legitimate blogs and publications, to siphon off micro-payments from Google’s AdSense program.
After I’ve read my news feeds I check the stats on my blog. Every now and then on WordPress, there’s a new variety of search spam or link spam that’s a kind of algorithmic spy “tradecraft”. I’m fascinated by the way these scams evolve. I’m amused by peculiar search terms, but more than three “people” landing at my blog, several days in a row, looking for the phrase “bulldog lifting leg”, which I’ve never used in any of my posts, set alarm bells ringing. I did a bit of reverse searching, and it seems that some of my posts have been scraped, the term “bulldog lifting leg” added to them, and the bots find them through a link to my home page. I report the scams to WordPress and they seem to have a good rapid response unit.
Rosanne Cash tweeted yesterday about Twitter spam she was receiving:
Looks like I have some not-quite-human and not-quite-bot followers. How adorable you are. #AndJustALittleScary.
WordPress now offers a service, through Zemanta, that lets an algorithm offer links for you to put within your posts.
“Once you’ve activated Zemanta, you’ll see several new widgets on your edit screen that let you quickly add recommended links, photos, tags, and articles. With just a few clicks your post goes from simple to snazzy.”
I don’t use this service, but I imagine the algorithm having the same kind of parameters as Google’s AdSense, looking for keywords and tags, without the granularity of context. I find AdSense both creepily literal and literally creepy.
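Here is a guess at what that keyword matching might look like in miniature: the ad whose keyword set overlaps most with the words on the page wins, with no sense of irony or register. The ads, keywords, and review text below are all invented; this is not how AdSense actually works, only an illustration of matching without context.

```python
import re

# Invented ads, each tagged with a handful of keywords.
ads = {
    "Guitar Heaven by Santana": {"greatest", "classics", "music"},
    "Acting classes, Sydney":   {"acting", "film", "sydney"},
    "Flights to London":        {"flight", "london", "sale"},
}

def pick_ad(page_text):
    """Serve the ad whose keywords overlap most with the page's words.
    Notice what this can't see: sarcasm, negation, a 'shattering
    sick joke' reads the same as sincere praise."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    return max(ads, key=lambda ad: len(ads[ad] & words))

# An invented review in the register of the 1964 Strangelove piece:
# scathing in tone, but stuffed with ad-friendly keywords.
review = ("Beyond any question the most shattering sick joke: "
          "one of the greatest classics of film music and acting.")
print(pick_ad(review))  # → Guitar Heaven by Santana
```

The “greatest classics” bait lands the guitar ad regardless of what the sentence actually means, which is roughly the literal-mindedness the Strangelove test above turned up.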
For years I’ve been tired of reading stories about musicians in the mainstream media. They all follow the same formula: a 20-minute interview in a hotel coffee shop, a little Googled biography, a few quick comparisons. An example of this is a snarky, completely irrelevant comment left on a profile of Grinderman I wrote for the Huffington Post: “Sounds like a bastard child of David Byrne and T Bone Burnett”.
What worries me about a service like Zemanta is that it will look for conformity, the top rating references will appear in many stories, and the same potted biographies will be in more stories. I fear that writers without any experience with human editors and fact-checkers will unquestioningly ‘trust’ the links they’re served.
It drives me nuts reading The Sydney Morning Herald and The Australian. The Sydney Morning Herald has now added a feature to its music reviews in the “Spectrum” section of the Saturday edition, where the human writers ape Amazon.com’s collaborative filtering algorithm that suggests other purchases. There’s a LIKE THIS? TRY THESE feature at the end of the capsule reviews, and the featured review has a pie chart: the LCD Soundsystem album This Is Happening is judged to be roughly 40% Eno / Talking Heads / Bowie, 10% Lou Reed, 20% Giorgio Moroder, 20% Neu!, and 10% Suicide.
What we lose with news disappearing behind paywalls, newspapers going out of business, and even the proof that tabloids beat up their stories (resorting to wiretaps now as well as paying sources) is a reliable common narrative. There’s no above-the-fold page-one event each day that we can all point to as a common present moment. Algorithms, targeted and shifting, provide no common ground.
What we miss with both the decline of mainstream media and algorithm-defined news are the stories that have no commercial value but immense human value: the bad, difficult news, like the community portraits of the underclass and criminal elements in Baltimore that David Simon and Ed Burns showed us in The Wire.
My sense of absolute reality in Sydney is now linked to the City of Sydney’s public library system: I look at the leaflets and pick up the few community newspapers that are still published in paper form.
I watched Series 7 of Spooks (MI5 in the USA) over the weekend. Television news reports and the front pages of newspapers are used as exposition, but also as a way to demonstrate that the public record may often be wildly different to the realities of the spies and the politicians.
Sometimes the news is sugar coated to avoid public panic, sometimes it’s deliberate misdirection so that the enemies don’t catch on to how much MI5 has caught onto, sometimes it’s spin from the government. The mainstream spy story relies on a shared, public reality as a straight man.
What I yearn for most in Australia is a digital-rights algorithm milled to an extremely fine tolerance, one that compensates for the byzantine import restrictions and trade protections of the analogue era. If I want to buy Series 8 of Spooks, shown a year ago on British television and soon to be available there on DVD (in a format regionally incompatible with Australia’s), I want an algorithm to consult a broadcast schedule and determine that if no broadcaster intends to screen a series in Australia, I should be able to buy a print-on-demand copy.
On my way home from the library I called into the supermarket to buy some soy milk. The store has been undergoing a gradual redesign that I’m both fascinated and repelled by. It has a large square footage, but the redesign has taken some shelving out and reduced the number of items it sells; my favourite white miso paste is no longer stocked. And some of the new shelving, a manger-like display unit for the premium cheeses, for example, seems to be willing us to believe that these changes are to make the joint classier.
There’s been a rearrangement of the items that suggests some kind of algorithmic choreography based on impulse-buying probabilities. It makes no sense to me, but it has me making those infinite Ikea loops, walking all the aisles. It’s impossible to go in, buy one item, and get out easily, but I suppose that’s the point.
A new checkout system has been installed as well. One side is a set of regular checkout counters manned by humans; a voiceover now tells you to move to Aisle 10. The other side is a set of self-check machines that I mostly use but that drive me crazy: the weight tray is either too sensitive or not sensitive enough and freezes, flashing the “assistance required” button.
What’s sad is how willing we are to put up with malfunctioning technology and the constant chatter of synthetic voices “please move item to bagging area”, “please take your change”. When the self-check machine says “thank you for shopping with the fresh food people” I can’t say to it, “this is a supermarket, fresh food, you’ve got to be kidding”. It’s part of the irrelevant machine etiquette we just have to endure, where a microwave oven will tell me to “enjoy my meal” and a bus will tell me to “have a nice trip”.
What mostly bothers me about this algorithmic infestation in our lives is that it’s incapable of irony and lacks wit. Or perhaps not. WordPress’s “possibly related posts (automatically generated)” algorithm could find no matches for my post about algorithms.