
WEB SCRAPING

Web Scraping

Figure 1: Web scraping

This blogpost will brief you on the concept of web scraping, a technique that helps you gather information online in no time. After reading this blogpost, you will understand the concept of web scraping, the advantages and disadvantages that come with it, as well as how web scraping is implemented in Python and used in real life by companies to maintain their operations.

What is Web Scraping

Web scraping is the automated process of extracting data from websites. It involves using software or scripts to navigate through web pages, retrieve specific information, and store it for analysis or other purposes. This technique is often used to gather large amounts of data quickly and efficiently from publicly accessible web pages, such as product prices, user reviews, or social media content.

Concept of Web Scraping

The concept of web scraping revolves around mimicking the behavior of a user browsing the web but in an automated manner. A scraper sends requests to a website, retrieves the HTML content, and then parses this content to extract the desired data. This data can be stored in a structured format like CSV or a database for further processing. Web scraping typically involves understanding the structure of web pages and using libraries or tools to navigate and extract information programmatically.

Implementation of Web Scraping in Python

Importing necessary modules

Figure 2: Importing necessary modules
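The exact modules in the screenshot are not recoverable, but a typical set for this workflow (assuming requests for fetching pages, BeautifulSoup for parsing HTML, and pandas for tabular display, as the later figures suggest) would look like this:

```python
# Typical imports for a scraping workflow:
# requests fetches the page, BeautifulSoup parses the HTML,
# and pandas tabulates the extracted records.
import requests
from bs4 import BeautifulSoup
import pandas as pd
```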

Checking the request and URL


Figure 3: URL for web scraping


Figure 4: Testing request response

Figure 5: Display Response
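The steps in Figures 3 through 5 can be sketched as follows. The URL here is a placeholder, not the page used in the original notebook, and the request is wrapped in a try/except so the snippet degrades gracefully when the network is unavailable:

```python
import requests

# Placeholder URL; substitute the page you intend to scrape.
url = "https://example.com"

# A status code of 200 means the server returned the page successfully;
# 403 or 429 often signal anti-scraping measures such as rate limiting.
try:
    response = requests.get(url, timeout=10)
    ok = response.status_code == 200
    html = response.text if ok else ""
except requests.RequestException:
    ok, html = False, ""  # network unavailable or request blocked

print("request succeeded:", ok)
```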


Figure 6: Response in HTML Text

Figure 7: Get title of the page

Figure 8: Page title
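Reading the page title (Figures 7 and 8) is a one-liner with BeautifulSoup. The HTML below is a stand-in for the response text of the live request, so the snippet runs offline:

```python
from bs4 import BeautifulSoup

# Placeholder HTML standing in for response.text from the request above.
html = "<html><head><title>Example Page</title></head><body></body></html>"

soup = BeautifulSoup(html, "html.parser")
page_title = soup.title.get_text()  # text inside the <title> tag
print(page_title)
```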


Web scraping

Figure 9: Creating data list


Figure 10: Scraping information
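The data-list and scraping steps (Figures 9 and 10) typically amount to looping over repeated page elements and appending one record per element. The sample HTML and the class names ("product", "name", "price") below are illustrative, not taken from the page scraped in the original post:

```python
from bs4 import BeautifulSoup

# Sample HTML standing in for the fetched page.
html = """
<div class="product"><span class="name">Laptop</span><span class="price">$999</span></div>
<div class="product"><span class="name">Mouse</span><span class="price">$25</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

data = []  # the data list being built up
for item in soup.find_all("div", class_="product"):
    data.append({
        "name": item.find("span", class_="name").get_text(),
        "price": item.find("span", class_="price").get_text(),
    })

print(data)
# [{'name': 'Laptop', 'price': '$999'}, {'name': 'Mouse', 'price': '$25'}]
```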

Display result


Figure 11: Displaying information in table format
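Turning the scraped records into a table (Figure 11) is usually done with pandas. The records below are hypothetical examples of what the scraping loop might produce:

```python
import pandas as pd

# Hypothetical records, as built up while scraping each page element.
data = [
    {"name": "Laptop", "price": "$999"},
    {"name": "Mouse", "price": "$25"},
]

# A DataFrame gives a table-like display and easy export.
df = pd.DataFrame(data)
print(df)
# df.to_csv("scraped_data.csv", index=False)  # persist for later analysis
```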

Advantages and disadvantages of Web Scraping

Advantages

  • It allows for the rapid collection of large datasets from multiple sources, saving time and effort compared to manual data collection.
  • Web scraping can provide real-time or frequently updated data, which is essential for tasks like monitoring prices or tracking trends.
  • Compared to purchasing datasets or using paid APIs, web scraping can be a more economical way to gather the necessary data.

Disadvantages

  • Scraping certain websites may violate their terms of service or intellectual property laws, leading to legal challenges.
  • Scraped data might be incomplete or inaccurate if the website structure changes or if the scraper encounters errors.
  • Websites may implement anti-scraping measures like CAPTCHAs or rate limiting, making it difficult to scrape data effectively.


Applications of Web Scraping in real life

Price monitoring

Figure 12: Price Monitoring

Many companies use web scraping to monitor competitor prices on Amazon, allowing them to adjust their own pricing strategies in real-time.

Property Aggregation

Figure 13: Property Aggregation

Real estate websites like Zillow use web scraping to aggregate property listings, prices, and trends from various sources to provide comprehensive market insights.

Finance

Figure 14: Finance

Financial institutions like Bloomberg use web scraping to gather financial data from various online sources to feed into their analytics and trading algorithms.
