Giter VIP home page Giter VIP logo

cosmic-cell-lines-webcrawler's Introduction

COSMIC-cell-lines-webcrawler

为了使用oncopredict来预测肿瘤药物,首先我需要详细了解COSMIC数据库中的所有细胞系信息。为了实现这一目标,我采用了Python编程语言,并结合requests、lxml、pandas和BeautifulSoup4等库来进行数据爬取和处理。

在开始爬取之前,我首先分析了COSMIC数据库的网页结构,找到了包含细胞系信息的链接。随后,我使用requests库发送HTTP/GET请求,获取这些链接的网页内容。

使用lxml库,我能够从网页内容中提取所需的细胞系详情数据。将数据存储在pandas的数据框架中,使其便于进行后续处理和分析。

由于细胞系数量较大,我编写了批量爬取的代码,自动处理所有细胞系的链接。这样一来,我能够高效地获取大量数据,为后续预测工作打下了坚实的基础。

经过数据爬取和处理,最终我成功获得了1020个细胞系的详细数据(CSV),包括与疾病相关的信息,这些数据将成为使用oncopredict进行肿瘤药物预测的重要依据。

To utilize oncopredict for tumor drug prediction, I needed to acquire detailed information about all cell lines in the COSMIC database.

I employed Python, along with the requests, lxml, pandas, and BeautifulSoup4 libraries, to scrape the required links for all cell lines.

The links were crawled in bulk, using a batch processing approach, to efficiently gather the data.

In the end, I successfully obtained detailed data on 1020 cell lines, including their associated diseases and other relevant information. These data serve as a crucial foundation for conducting tumor drug prediction with oncopredict.

cosmic-cell-lines-webcrawler's People

Contributors

pheonixchen avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.