A few weeks ago, I posted “Data Crunching for Obama,” a look at the Democratic campaign’s microtargeting strategies led by Vijay Ravindran, chief technology officer at Catalist, Harold Ickes’s start-up political technology company that built a national voter database of information on more than 260 million people for progressive groups, including the Obama campaign.
At Catalist, Ravindran led all the technology aspects of developing the company’s software products and services. The data banks and web-based tools he helped develop could answer questions such as: “How many Indian-Americans gave money to me, said they were an Obama supporter, voted in the last general election, own their home and live in Baltimore?”
Below the fold is a Q&A with Vijay Ravindran, where he talks about his engagement with politics, the 2008 election efforts, Catalist’s role in it, and what South Asian voter data tells us about the “brown” community.
Incidentally, the 34-year-old is on a roll. Just yesterday, it was announced that as of February ’09, Ravindran will be the senior vice president and chief digital officer of The Washington Post Company. Per the press release that went out:
“We are fortunate to have Vijay join the Company as we focus increasingly on electronic media,” said Donald E. Graham, chairman and chief executive officer of The Washington Post Company. “Vijay is widely recognized as one of the top innovators in the field. I am delighted that he will bring his extraordinary skills, talent and experience to our efforts to expand our digital business.”
Back in the days when you were at Amazon and prior, could you ever have imagined yourself entering the sphere of politics?
I cared enough to vote, and always had strong feelings about voting Democratic prior to 2004, but I didn’t do anything beyond that. Looking back, I feel somewhat embarrassed that I didn’t appreciate the Clinton years. I was a college 1st year when Clinton got elected to his first term. I was happy, but not as happy as I should have been. And as someone who had the summer off before joining the workforce in 1996, I barely paid attention to the re-election. And volunteering never crossed my mind.
My wife has always been a personal trail blazer for me. She’s the reason I moved to Seattle and took the first job I could find in 1998 (Amazon). She became active in 2000 campaigning for Gore and Cantwell, and protesting the outcome after the election was stolen.
But like so many other people I know, my eyes were opened to how important politics and which political party is in power by the aftermath of 9/11. In small ways, I got more involved in 2004; I gave money for the first time, caucused in Washington State during the primary, and canvassed in South Seattle for Kerry/Gregoire. But my imagination really got stirred when I heard tales from two of my friends who I had worked with at Amazon, who had retired prior to 2004. They had volunteered in Cuyahoga County & DC and actually did technology related work for the Kerry Campaign and the DNC respectively. I had never realized that my day job skills could be put to such relevant use. So that definitely got me thinking, but it was a pipe dream at that point. I’m not much of a career roadmap guy, so I didn’t know what to do with the pipe dream. But then lo and behold, my wife who had been working on her PhD in Seattle got a great faculty offer from the University of Maryland at College Park, and we decided to move to Washington DC. Through my two politically active Amazonian friends, I was connected to Harold Ickes. We hit it off really well, and things just fell into place. It was meant to be.
What is it that you do exactly on a day-to-day basis, in plain English, for those non-quantitative folks out there?
Catalist acquires publicly available data on people and organizes it. The core of that information is voter registration and vote history data (known in political parlance as the “voter file”) acquired from Secretary of State Offices and sometimes at the county level. Catalist marries that information with commercially available demographic information. We then provide web-based data mining tools as well as integrate with leading applications on the Democratic side to support a wide range of civic engagement from direct mail fundraising to door to door canvassing.
Catalist clients can store with Catalist any person level data they collect like the amount of a donation or the response to a canvass question (e.g. Do you support McCain or Obama?). We’ve built technology to match disparate pieces of data, and they can query their own data alongside the data Catalist collects. So our web-based tools can answer a question such as “How many Indian-Americans gave money to me, said they were an Obama supporter, voted in the last general election, own their home and live in Baltimore?”
On a day to day basis, our team is constantly updating new voter registration info across the country, uploading customer contributed data, and building new helpful models that synthesize the Catalist dataset in new and useful ways. And we continue to develop new tools to assist end users in querying, visualizing, and using the data for their civic engagement.
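As a rough illustration of the Baltimore question in the answer above, here is a minimal Python sketch of matching client-collected data to voter file rows and counting the people who meet every condition. It is not Catalist's actual schema or tooling: the record types, field names, the `count_matching_people` helper and the sample rows are all hypothetical and exist only to show the shape of the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoterFileRow:
    """Hypothetical row from a voter file merged with commercial data."""
    person_id: int
    city: str
    ethnicity: str
    owns_home: bool
    voted_last_general: bool


@dataclass(frozen=True)
class ClientRow:
    """Hypothetical row of data a client collected about a person."""
    person_id: int
    donated: bool
    candidate_response: str  # answer to "Do you support McCain or Obama?"


def count_matching_people(voter_file, client_rows):
    """Count people who satisfy every condition in the example question."""
    # Index the client's own data by person so it can be matched to the voter file.
    client_by_person = {row.person_id: row for row in client_rows}
    count = 0
    for person in voter_file:
        client = client_by_person.get(person.person_id)
        if client is None:
            continue  # the client holds no data on this person
        if (
            person.ethnicity == "Indian-American"
            and person.city == "Baltimore"
            and person.owns_home
            and person.voted_last_general
            and client.donated
            and client.candidate_response == "Obama"
        ):
            count += 1
    return count


# Made-up sample data, for illustration only.
VOTER_FILE = [
    VoterFileRow(1, "Baltimore", "Indian-American", True, True),
    VoterFileRow(2, "Baltimore", "Indian-American", True, False),  # did not vote
    VoterFileRow(3, "Seattle", "Indian-American", True, True),     # wrong city
    VoterFileRow(4, "Baltimore", "Indian-American", True, True),   # no client data
]
CLIENT_ROWS = [
    ClientRow(1, True, "Obama"),
    ClientRow(2, True, "Obama"),
    ClientRow(3, True, "Obama"),
]

MATCHES = count_matching_people(VOTER_FILE, CLIENT_ROWS)
print(MATCHES)
```

The sketch assumes both data sets already share a person identifier. In practice, as the answer notes, matching disparate records to the same person is itself a hard problem that this example skips.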
Applying “microtargeting” to the Democratic campaign was really “taking a page out of the Republican playbook.” Tell us about the back end implementation of this strategy – your role in it.
The key to microtargeting is the underlying data. Without current reliable data on the electorate, effective predictive modeling is not possible. That’s where Catalist fits in. My role was to build the technology organization within Catalist and be the forward facing advocate for what we were building. I spent a lot of time with Harold Ickes (Catalist’s founder) and Laura Quinn (Catalist’s CEO) in late 2005 and early 2006 convincing investors that it could be done, and that our business plan was sound. And I hired a great team of software developers who wanted to make a difference for Democrats.
How is the work of an organization like Catalist going to change the political landscape in coming elections?
Catalist’s goal is to be a permanent part of progressive infrastructure for Democrats and allied organizations. As more groups use Catalist, the underlying data is enriched and augmented, which results in more effective civic engagement programs for Catalist clients. We like to think of ourselves like the electric company for campaigns. No one organization would have the budget or wherewithal to build Catalist, but by having a large portion of the progressive universe subscribing to Catalist, a service is available that makes everyone better.
For those who feel that what you do is a violation of privacy laws, can you explain why this isn’t the case?
Catalist’s written policies dictate that we only compile data that is either public information; collected by our clients in the course of ongoing interaction with voters, members and supporters; or through commercial transaction information conducted in the open marketplace. These policies also dictate that we do not acquire, accept or store social security numbers or any personal credit, legal or medical records.
What portrait of the South Asian community did your data paint? Was there a variation by region? By age? By immigration date? (Is there any truth at all to the generalization about South Asians and Hondas/Toyota Corollas?!)
The most interesting data I’ve seen with South Asians is that their voter turnout is disproportionately worse than the general population, while their political donations are disproportionately higher than the general population.
There are a lot of areas I’d love to explore more closely. I think for one, the behavior of 1st and 2nd generation South Asians differs a great deal. Unfortunately, it is hard to differentiate them with the data in its current state amongst the unregistered voting age person population.
I am glad you mentioned the car example. Microtargeting in the press has been primarily about detailed item level behavior (e.g. gin & tonic, you’re a Dem). In reality, unless your data source has great coverage across the electorate, detailed information like your car, or yoga is not useful for predicting other people’s behavior (unless you know about everyone’s car and exercise habits too). So most modeling on the Democratic side focuses on attributes that are available on everyone like home ownership or the census characteristics of where they live. It is much less sexy than you would be led to believe.
And in the long term, what’s really valuable is people’s actual answers to questions like “Do you want troops out of Iraq?” or “Do you plan to support Barack Obama for President?” It is much better to predict behavior using this collected data than lifestyle data if you have the choice. What’s going on right now is a weak proxy to what will be possible in the future, and I think to most people will be less invasive on the lifestyle front.
I am on the Board of the Indian American Leadership Initiative, which is an organization dedicated to furthering Indian-American Democrats in politics. I would like to see IALI take a prominent role developing political knowledge around South Asians and Indian-Americans. There’s not a whole lot out there. Catalist is very supportive of academic research, so for those of you out there interested, there’s a data set that can be made available for qualified academic pursuit. And it matters politically right now because of how important South Asians are for political fundraising. People sometimes dismiss the clout of South Asians because of our voting numbers, but our relative prosperity makes up for a lot.
What was the single most defining moment in this election campaign for you?
For me, it came from the processing of early vote and absentee data released nightly from counties in key states such as Ohio, Florida, Colorado, and North Carolina.
We hired a nightshift that was managed by my colleague Anupama Pillalamarri. The team was primarily recent law school graduates who had just taken the Bar and were waiting for their results. They had the job of taking files collected in the counties and doing the manual mapping and audit steps for this vital data that was then turned around by the next morning in a usable manner. The processed information was then utilized by key Catalist clients for important field work. I stayed with them on their second night on the job, and was blown away by their dedication and energy. This election was special, and that night I knew it, both because of the energy of the nightshift, but you could see the high participation of African-Americans and the Youth vote already. Walking to my car at 5am out of our office, I felt great, not only to be a small part of such a historic movement this year, but also that our side’s chances were looking strong.
A Washington Post article stated that “As the chief data-architecture guy at Catalist, [Ravindran is] part of a new trend in political technology: As data become more important in campaigning, candidates are increasingly turning to the tech industry for business-level expertise.” Have you found this to be the case in Obama’s approach to building his transition team and how do you see this extending to the new administration?
Absolutely, you see it first hand in the transition team makeup, where there’s a mix of people successful in the business world (e.g. Julius Genachowski) and those who know the inner workings of politics and DC (e.g. John Podesta).
To me, the really differentiating aspect of the Obama campaign was not simply their embrace of new technology and approaches, but in their ability to embrace the new while not throwing out the best of the old. They built a hybrid that worked. They used the new technology to engage people in ways that had never been done before, but they didn’t forget how to do the basics of turning out voters, crafting compelling policy positions and running a competent campaign. We’ve seen examples before of a candidate developing ground-breaking techniques to raise money or attract volunteers, but then failing to deliver the votes. That didn’t happen this time for the Obama campaign and it was no accident.
What’s next, both for you and Catalist?
For Catalist, we’re extremely excited to continue to learn from this past cycle from our users, build better tools that allow the data to be used in new and more effective ways. We’re especially excited that many of our clients will be starting up new campaigns in 2009 to mobilize supporters on exciting new legislation such as health care, energy, and worker’s rights.
Personally, I am excited to join a great leadership team at the Washington Post Company. I am eager to help them develop new ways to distribute news and information in this rapidly changing world. Until I get there in February, there’s not a whole lot more to say.
What were some of the most significant efforts or outcomes of Catalist’s efforts in the past election cycle?
We recently released the following stats from this past cycle:
Catalist worked with 90 progressive advocacy, political and not-for-profit client organizations.
Catalist clients logged 335.6 million contacts (phone, mail, email, at-the-door, et al) in the 2008 cycle.
These contacts reached more than 126 million unique people, and 259.2 million individual responses were recorded, including 18.3 million presidential ID’s for 12.1 million distinct people; this compares to 13 million presidential ID’s from 8.5 million distinct people in 2004.
Catalist clients submitted 6.6 million voter registration applicants for unique individuals.
Over 250 models (indicating such attributes as likely partisanship, turnout propensity, Obama support, likelihood to be married, likelihood to have a college degree, support for choice, et al) were generated by Catalist and client organizations, resulting in 8.9 billion appended scores.
More than a dozen progressive pollsters drew nearly 2,000 poll samples.
Catalist’s national voter database includes 186 million registered voter records, and 80 million unregistered records for persons over 18 but not registered.
Client organizations have appended and stored 71.9 million membership flags (55.2 million distinct people), and 19.3 million donor flags (8.2 million distinct people).
169 voter files across all 50 states, the District of Columbia, Puerto Rico, and Guam, totaling 769 million individual voter records, were processed and updated by Catalist this year.
65 million absentee and early vote records were processed, representing 12 million distinct people.
Through Catalist’s web-based “Q Tools,” clients queried over 5 billion person records, and cut lists containing over 66 million person records.
Catalist collected and appended 9.4 million specialty data flags, such as hunting, fishing, teaching, child care, pilot, doctor, nurse, and other professional licenses, for 8.2 million distinct people.
what a nice person. he’s the kind of boy one could take home to meet mom.
Alas, Shaila, someone already took him home to meet mom…But heck, his work is fascinating…
Vijay’s work on data crunching continues to fascinate me for some strange reason (maybe because of all this election fever, Obama-mania and media hype). Right now I am reading a related interesting article (subscription required) – Why Rich States Aren’t Republican which is kind of a book review of Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do that delves into Vijay’s field
His involvement at the Washington Post will definitely be something to watch. Who knows to what end exactly Obama’s success could be attributed to this interesting mix of technology, information and marketing, but Obama nevertheless won – so I suppose my hope is that this kind of approach, this kind of thinking, will have continued success elsewhere (i.e. a newspaper). American journalism is in crisis and every once in a while appears some new fad that is supposed to save the industry. People continue to bandy about catch-phrases without really understanding them; not only that, they pour money into research and training, but the results are often time-wasters that audiences aren’t receptive to. I’ll look later, but has there been anything about Ravindran in AJR or CJR? Or would it be too early?
I’ve known Vijay for roughly 15 years (even lived with him for a few) and I can say his success is no accident or fluke. He comes from a brilliant family, is hard working, and has an open mind. I know he’ll do great things at WaPo and beyond. Congrats!
Not to detract from Catalist’s efforts – but that list above is so much about how much data was collected. What would really be interesting is to see how they sliced and diced the data – was it classic dimensioning, like major retailers have been doing for years? Probably not. It would be fascinating to read if there was actual heuristic analysis built around subjective data points*. That’s an oxymoron, I know. Maybe not in a 2.0 world – maybe Catalist could tell us more.
*Just my aging mind trying to dredge up decades old comp.sci. mumbo-jumbo.
@neale, they probably don’t want to give away all their techniques.
On topic, people. And the topic is not who got hacked.
The other day I was reading somewhere about future complications related to confirmation hearings and vetting processes for official positions for the next generation, with so much of an individual’s private life in cyberspace. As recent events suggest [After the Crash: How Software Models Doomed the Markets ] the net will begin to influence real life. With “Second life” already becoming a craze, neuroscientists are conjecturing that the Brain is now in evolution mode. Vijay’s work if extended will actually be a useful tool for scientific data analysis where there is a data overload due to increasing computational power.
7 · shaila said
Because of [ Crime of Reason ]
shaila, an amusing (digressionary perhaps)anecdote from the above book related to your comment
Washington Post Hired Left-Wing Obama Enabler as Its ‘Chief Digital Officer’ http://newsbusters.org/blogs/brent-baker/2009/07/03/washington-post-hires-left-wing-obama-enabler-its-chief-digital-officer