http://www.nytimes.com/2013/01/25/sports/25iht-sumo25.html?_r=2& The first paragraph of the article is cut out of the extracted content. The HTML contains multiple elements with the articleBody CSS class, and we could use that class to help keep all of the content.
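
A rough sketch of the idea, not a patch against the actual extractor: collect the text of every element whose class list contains articleBody instead of keeping only one candidate block. It uses only Python's standard-library html.parser. The names ArticleBodyCollector and extract_article_body are hypothetical, and the sample markup is made up for illustration rather than copied from the NYT page.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ArticleBodyCollector(HTMLParser):
    """Hypothetical helper: gather text from every element with a given class."""

    def __init__(self, class_name="articleBody"):
        super().__init__()
        self.class_name = class_name
        self.depth = 0       # nesting depth inside a matching element
        self.blocks = []     # one text string per matching element
        self._current = []   # text fragments of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Already inside a matching element: just track nesting.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.depth = 1
            self._current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Matching element closed: normalize whitespace and store it.
            text = " ".join("".join(self._current).split())
            if text:
                self.blocks.append(text)

    def handle_data(self, data):
        if self.depth:
            self._current.append(data)


def extract_article_body(html):
    """Return the text of all articleBody elements, in document order."""
    parser = ArticleBodyCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


if __name__ == "__main__":
    sample = (
        '<div class="articleBody"><p>First paragraph.</p></div>'
        '<div class="ad">skip</div>'
        '<div class="articleBody"><p>Second <b>paragraph</b>.</p><br></div>'
    )
    print(extract_article_body(sample))
```

The depth counter assumes reasonably well-formed markup; unclosed non-void tags such as a bare `<p>` would throw it off, so a real fix would more likely hook into whatever DOM the extractor already builds.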