wanghoppe/Maoyan_price

Maoyan_price

Scrapes movie ticket prices from m.maoyan.com. See Main.ipynb.

Since the website uses Ajax to load the page dynamically, I used selenium with phantomjs to run the JavaScript inside the HTML before reading the page.
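A minimal sketch of that step, not the code from Main.ipynb. The helper names `show_url` and `fetch_rendered_html` are hypothetical, and `webdriver.PhantomJS()` is only available in Selenium 3.x and earlier (PhantomJS support was removed in Selenium 4), so pass in another driver if you are on a newer version.

```python
def show_url(cinema_id):
    """Build the show-list URL for a cinema id (URL pattern taken from the sample data note)."""
    return "http://m.maoyan.com/shows/{}?v=yes".format(cinema_id)


def fetch_rendered_html(cinema_id, driver=None):
    """Load the page in a browser driver and return the HTML after JavaScript has run.

    If no driver is given, a PhantomJS driver is created (assumes Selenium 3.x
    and a phantomjs binary on PATH) and closed again when done.
    """
    own_driver = driver is None
    if own_driver:
        # Imported lazily so the helper above works without selenium installed.
        from selenium import webdriver
        driver = webdriver.PhantomJS()
    try:
        driver.get(show_url(cinema_id))
        # page_source reflects the DOM at the time of the call; content that
        # arrives later via Ajax may need an explicit wait before this line.
        return driver.page_source
    finally:
        if own_driver:
            driver.quit()
```

Calling `fetch_rendered_html(881)` would fetch the page for cinema 881. Accepting an existing driver keeps the function usable with headless Chrome or Firefox as well.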

Maoyan also hides the price information by using a custom font (a woff file embedded in the HTML) that maps obscure characters to digits; for example, one such character renders as the number 5. I used ImageMagick's convert command to render each character as a 30dp x 20dp .jpg image file (stored in the mapping_num folder) and recognized the digit in it with a 3-layer neural network. For the training data source and the training procedure, see Training Neural Network.ipynb inside the training folder.
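Once each glyph has been recognized, decoding a price is a character-for-character substitution. The sketch below illustrates only that last step. The function name `decode_price` and the code points in the example mapping are hypothetical, since the real mapping depends on the woff file embedded in the page.

```python
def decode_price(obfuscated, glyph_to_digit):
    """Replace each custom-font character with the digit it was recognized as.

    Characters missing from the mapping (such as the decimal point) pass
    through unchanged.
    """
    table = str.maketrans(glyph_to_digit)
    return obfuscated.translate(table)


# Hypothetical mapping: in practice it would be built from the classifier's
# output for each glyph image in the mapping_num folder.
example_mapping = {"\ue001": "3", "\ue002": "5"}
example_price = decode_price("\ue001\ue002.\ue002", example_mapping)
```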

Sample data

data.sqlite: sample data scraped from http://m.maoyan.com/shows/881?v=yes (881 is the cinema id in Maoyan)
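The schema of data.sqlite is not described here, so a reasonable first step is to list its tables. This is a small sketch using Python's built-in sqlite3 module; the function name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted by name."""
    # Note: connecting to a path that does not exist creates an empty database.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Running `list_tables("data.sqlite")` from the repository root should show which tables to query next.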

Required
