Chase, Mint viability, and web scraping

It is interesting how much attention this small company has captured over the past week. Search for mint money management on Google and you get almost 2 million results. I made my own contribution just yesterday, when I questioned the viability of this service. One of the problems is its reliance on web scraping to obtain financial data.

Web scraping is a technique that allows automated web services (often called web bots) to take web pages apart and filter out the information they need, banking transactions in this case. My understanding is that when Yodlee does it on Mint’s behalf, the banks don’t necessarily know their pages are being scraped. (more on web scraping from Wikipedia)
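
To make the idea concrete, here is a minimal sketch in Python of what "taking a page apart" can look like. I have no idea how Yodlee actually implements it; the page markup, the `TransactionScraper` class and the `scrape_transactions` function are all made up for illustration, and a real bank page would be far messier.

```python
from html.parser import HTMLParser

# Hypothetical statement page with made-up data; real bank markup will differ.
SAMPLE_PAGE = """
<html><body>
<h1>Recent activity</h1>
<table>
<tr><th>Date</th><th>Description</th><th>Amount</th></tr>
<tr><td>2000-01-02</td><td>Coffee shop</td><td>-3.75</td></tr>
<tr><td>2000-01-03</td><td>Payment received</td><td>150.00</td></tr>
</table>
</body></html>
"""


class TransactionScraper(HTMLParser):
    """Collects the text of every three-cell <td> row as one transaction."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.row = []
        self.transactions = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []            # start collecting a new row
        elif tag == "td":
            self.in_cell = True
            self.row.append("")      # new empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and len(self.row) == 3:
            # Header rows use <th>, so they never reach three <td> cells.
            date, description, amount = (cell.strip() for cell in self.row)
            self.transactions.append((date, description, float(amount)))

    def handle_data(self, data):
        if self.in_cell:
            self.row[-1] += data     # accumulate text inside the current cell


def scrape_transactions(html):
    """Return a list of (date, description, amount) tuples found in the page."""
    parser = TransactionScraper()
    parser.feed(html)
    return parser.transactions


if __name__ == "__main__":
    for transaction in scrape_transactions(SAMPLE_PAGE):
        print(transaction)
```

Notice how fragile this is: the scraper depends entirely on the layout of the page, so any redesign on the bank's side can silently break it.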

I gave it some more thought today and I think I know why Mint (and Yodlee) are having problems fetching Chase credit card transactions. If you have an account with Chase, you have probably noticed the extra layer of security Chase introduced recently. Now every time you access your account from a new computer or browser, you have to confirm that this is a trusted location by entering a verification code they send to your e-mail or cell phone. Since Mint (Yodlee) has to log in to your account before it can scrape transactions from the web pages, it has to go through the same verification process as you do. And since it has no access to your e-mail or phone, it obviously can’t pass it.
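
Here is a toy model of that verification step as I understand it from the customer's side. This is a sketch under my own assumptions, not Chase's actual system: the `ToyBank` class, its method names and the six-digit code are all invented for illustration.

```python
import secrets


class ToyBank:
    """Hypothetical stand-in for a bank login with new-device verification."""

    def __init__(self, password):
        self._password = password
        self._trusted_devices = set()
        self._pending_codes = {}
        self.owner_inbox = []  # stands in for the owner's e-mail or phone

    def login(self, password, device_id):
        if password != self._password:
            return "denied"
        if device_id in self._trusted_devices:
            return "ok"
        # Unknown device: send a one-time code to the account owner only.
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending_codes[device_id] = code
        self.owner_inbox.append(code)
        return "verification_required"

    def verify(self, device_id, code):
        if self._pending_codes.get(device_id) == code:
            self._trusted_devices.add(device_id)
            del self._pending_codes[device_id]
            return True
        return False


if __name__ == "__main__":
    bank = ToyBank("s3cret")

    # The owner reads the code from the inbox and gets the device trusted.
    print(bank.login("s3cret", "owner-laptop"))
    print(bank.verify("owner-laptop", bank.owner_inbox[-1]))
    print(bank.login("s3cret", "owner-laptop"))

    # The scraper knows the password but cannot read the owner's inbox.
    print(bank.login("s3cret", "scraper-server"))
    print(bank.verify("scraper-server", "a-guess"))
```

The scraper has the right password, yet it stays stuck at the verification step because the code only ever goes to the owner.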

The bottom line is that Mint will continue having these issues unless Yodlee is able to strike a deal with Chase (and other banks in the future) that allows it to retrieve your transactions through other means or to bypass this extra security verification. Is it likely? I don’t know, but from my own experience I can tell you a lot about how slow and self-protecting large corporations are.

This situation probably puts Mint in a very uncomfortable position at a very inconvenient time. It is ironic how the life of a startup can depend on such a seemingly small thing. Web scraping has always been a shady business, and I am surprised that Yodlee has gone so far with it.

Related: Web 3.0: When Web Sites Become Web Services from ReadWriteWeb