
How to extract the rating of a movie from IMDb using Scrapy in Python

@pagal_guy wrote:

Hello,

I am using Scrapy to extract data about a movie from IMDb. However, the rating is not plain text on the page: it is contained within an img tag, in its alt attribute.

My code for extracting the ratings using scrapy is:

from scrapy.spiders import Spider
from scrapy.selector import Selector
from imdb.items import ImdbItem


class ImdbSpider(Spider):
    name = "imdb"
    allowed_domains = ["imdb.com"]
    start_urls = [
        "http://www.imdb.com/title/tt0068646/reviews?ref_=%20best",
    ]

    def parse(self, response):
        sel = Selector(response)
        #ratings = sel.xpath('//[@id="tn15content"]/div/img')
        ratings = sel.xpath('//div[contains(@id,"tn15content")]/div/img')
        items = []

        for rating in ratings:
            item = ImdbItem()
            item['rating'] = rating.xpath('/@alt').extract()
            items.append(item)

        return items

However, nothing is returned, and no error is raised.

Also, the feed export reports an item count of 10, which is the number of reviews on a single page, yet the CSV file is empty.
I cannot work out how to extract the 10/10 figure from within the img tag.
Can someone please help me with this?

Posts: 1

Participants: 1
