@pagal_guy wrote:
Hello,
I am using Scrapy to extract data about a movie from IMDb. However, the rating is contained within an img tag, in its alt attribute.
My code for extracting the ratings using Scrapy is:
from scrapy.spiders import Spider
from scrapy.selector import Selector
from imdb.items import ImdbItem


class ImdbSpider(Spider):
    name = "imdb"
    allowed_domains = ["imdb.com"]
    start_urls = [
        "http://www.imdb.com/title/tt0068646/reviews?ref_=%20best",
    ]

    def parse(self, response):
        sel = Selector(response)
        #ratings = sel.xpath('//[@id="tn15content"]/div/img')
        ratings = sel.xpath('//div[contains(@id,"tn15content")]/div/img')
        items = []
        for rating in ratings:
            item = ImdbItem()
            item['rating'] = rating.xpath('/@alt').extract()
            items.append(item)
        return items
However, nothing is returned, and no error is raised.
Also, the stored CSV feed count is shown as 10, which is the number of reviews on a single page, yet the CSV file is empty.
I cannot work out how to extract the 10/10 figure from within the img tag.
Can someone please help me with this?