
Why Do the New AI Agents Choose Markdown Over HTML?

AI agents are taking over the world and mark the next big step in the evolution of AI 🦖. So, what do all these agents have in common? They use Markdown instead of raw HTML when processing content on web pages ⛓. Curious to find out why?

This blog post will show you how this simple trick can save you up to 99% in tokens and money!

AI Agents and Data Processing: An Introduction

AI agents are software systems that harness the power of artificial intelligence to accomplish tasks and pursue goals on behalf of users. Equipped with reasoning, planning, and memory, these agents can make decisions, learn, and adapt, all on their own. 🤯

In recent months, AI agents have taken off, especially in the browser automation world. These AI agent browsers let you use LLMs to control browsers programmatically and to automate tasks such as adding products to your Amazon cart 🛒.

Ever wondered what AI agent libraries and frameworks such as Crawl4AI, ScrapeGraphAI, and LangChain have in common?

When processing data from web pages, these solutions often convert HTML to Markdown automatically, or offer ways to do so, before sending the data to LLMs. But why do AI agents prefer Markdown over HTML? 🧐


The short answer is: to save tokens and speed up processing!

Time to dig deeper! But first, let's take a look at another approach AI agents use to reduce the data load. 👀

From Data Overload to Clarity: The First Step for AI Agents

Imagine you want your AI agent to:

  1. Connect to an e-commerce site (for example, Amazon)

  2. Find a product (for example, PlayStation 5)

  3. Extract data from that specific product page

This is a common scenario for an AI agent, as e-commerce scraping is a wild ride. After all, product pages are a chaotic mess of constantly changing layouts, which makes programmatic data parsing a nightmare. This is where AI agents show off their superpowers, leveraging LLMs to extract data seamlessly, no matter how chaotic the page structure is!

Now, let's say you are on a mission to grab all the key details from the PlayStation 5 product page on Amazon 🎮:

PlayStation 5 Amazon page

Here's how to prompt an AI agent browser to achieve this:

Navigate to Amazon's homepage. Search for 'PlayStation 5' and select the top result. 
Extract the product title, price, availability, and customer ratings. 
Return the data in a structured JSON format.

This is what the AI agent should do (hopefully 🤞):

  1. Open Amazon in the browser 🌍

  2. Search for "PlayStation 5" 🔍

  3. Identify the correct product 🎯

  4. Extract the product details from the page and return them in JSON 📄

But here's the real challenge: step 4. The Amazon PlayStation 5 page is a beast! Its HTML is packed with tons of information, most of which you don't even need.

Want proof? Copy the page's full HTML from the DOM in your browser and drop it into a tool like an LLM token calculator:

The token calculator result

🚨 Brace yourself…

896,871 tokens!

896,871 tokens?! 😱 Yes, you read that right: eight hundred ninety-six thousand, eight hundred seventy-one tokens!

That's a huge load of data, a.k.a. tons of money! 💸 (More than $2 per request on GPT-4o! 😬)


As you can imagine, passing all this data to an AI agent comes with major limitations:

  1. It may require premium/pro plans that support high token usage 💰
  2. It costs a fortune, especially if you run frequent requests 🤑
  3. It slows down responses because the AI has to process a ridiculous amount of information ⏳

The Fix: Trimming the Fat

Most AI agents let you specify a CSS selector to extract only the relevant sections of a web page. Others use heuristic algorithms to filter content automatically, such as stripping headers and footers (which usually add no value). ✂

For example, if you inspect the Amazon PlayStation 5 product page, you'll notice that most of the useful content lives inside the HTML element identified by the #ppd CSS selector:

The #ppd HTML element

Now, what if you told the AI agent to focus only on the #ppd element instead of the entire page? Would that make a difference? 🤔
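To make the idea of scoping to one element concrete, here is a minimal Python sketch that keeps only the markup inside the element with a given id and drops everything else. It uses only the standard library. The class name ElementById, the function extract_by_id, and the tiny sample page are all made up for illustration, and real agent libraries will do this differently (and more robustly).

```python
from html.parser import HTMLParser

# Tags without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementById(HTMLParser):
    """Re-emits the markup of the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # 0 means "outside the target element"
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            self.chunks.append(self.get_starttag_text())
            if tag not in VOID_TAGS:
                self.depth += 1
        elif dict(attrs).get("id") == self.target_id and tag not in VOID_TAGS:
            self.chunks.append(self.get_starttag_text())
            self.depth = 1

    def handle_endtag(self, tag):
        if not self.depth or tag in VOID_TAGS:
            return
        self.chunks.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_id(html, element_id):
    parser = ElementById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Hypothetical, heavily simplified stand-in for a product page.
page = (
    "<html><body><header>Deals | Orders | Cart</header>"
    '<div id="ppd"><h1>PlayStation 5 console</h1><span>In stock</span></div>'
    "<footer>Conditions of Use</footer></body></html>"
)

print(extract_by_id(page, "ppd"))
```

The header and footer never reach the output, which is the whole point: the LLM only sees the part of the page you care about.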

Let's put it to the test in the head-to-head showdown below! 🔥

Markdown vs HTML in AI Data Processing: Head-to-Head Comparison

Compare the token usage when processing a portion of a web page directly versus converting it to Markdown first.

HTML

In your browser, copy the HTML of the #ppd element and drop it into a token calculator:

309,951 tokens this time

From 896,871 tokens to only 309,951: nearly a 65% saving!

That's a significant drop, for sure, but let's be real: it's still way too many tokens! 😵‍💸

Markdown

Now, let's replicate the trick AI agents use by taking advantage of an online HTML-to-Markdown conversion tool. But first, remember that AI agents perform some preprocessing to remove tags that carry no meaningful content, such as <script> and <style>, for example with a regular expression:

const scriptRegex = /
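Since the snippet here is cut short, the following is a minimal Python sketch of that cleanup step, assuming the goal is simply to drop <script> and <style> blocks before conversion. The names NOISE_TAGS and strip_noise are made up for illustration, and a regex is a pragmatic shortcut here, not a robust HTML parser.

```python
import re

# Matches whole <script>...</script> and <style>...</style> blocks,
# including attributes and multi-line bodies.
NOISE_TAGS = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_noise(html: str) -> str:
    """Return the HTML with script and style blocks removed."""
    return NOISE_TAGS.sub("", html)


sample = (
    "<div><style>.a{color:red}</style>"
    "<p>In stock</p>"
    '<script>track("x < y");</script></div>'
)

print(strip_noise(sample))
```

For messy real-world pages, a proper HTML parser is the safer choice, but the regex is enough to show how much dead weight disappears before the Markdown conversion even starts.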

Next, copy the cleaned HTML and convert it to Markdown using an online HTML-to-Markdown converter:

HTML to Markdown

The resulting Markdown is much smaller, but it still contains all the important text data!
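If you are curious what such a conversion boils down to, here is a deliberately tiny Python sketch using only the standard library. It handles just headings, paragraphs, list items, bold text, and links, so treat it as an illustration of the principle and not as a replacement for a real converter such as markdownify or html2text. The class TinyMarkdown, the function html_to_markdown, and the sample HTML are all hypothetical.

```python
import re
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Toy converter: headings, paragraphs, lists, bold text, and links."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag in ("p", "ul", "ol"):
            self.out.append("\n\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag in ("b", "strong"):
            self.out.append("**")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("b", "strong"):
            self.out.append("**")
        elif tag == "a":
            self.out.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        # Collapse runs of whitespace left behind by HTML indentation.
        text = re.sub(r"\s+", " ", data)
        if text.strip():
            self.out.append(text)


def html_to_markdown(html: str) -> str:
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


html = (
    '<div id="ppd" class="a-container" data-feature="x">'
    '<h1 class="a-size-large">PlayStation 5 console</h1>'
    '<p class="a-price">Price: <b>$499.00</b></p>'
    '<ul><li>In stock</li><li><a href="/reviews">4.6 out of 5</a></li></ul>'
    "</div>"
)

markdown = html_to_markdown(html)
print(markdown)
print(len(html), "characters of HTML ->", len(markdown), "characters of Markdown")
```

All the class names, data attributes, and wrapper tags vanish, while the text an LLM actually needs survives. That is where the token savings come from.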


Now, paste this Markdown into the LLM token calculator tool:

7,943 tokens!

Boom! 💣 From 896,871 tokens to only 7,943 tokens. That's a saving of ~99%!
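The percentages quoted so far are easy to double-check. The short Python sketch below recomputes them from the three token counts reported above; the function name saving is just an illustrative choice.

```python
def saving(before: int, after: int) -> float:
    """Percentage of tokens saved when going from `before` to `after`."""
    return (1 - after / before) * 100


# Token counts reported in this post.
full_page, ppd_html, ppd_markdown = 896_871, 309_951, 7_943

print(f"#ppd HTML vs full page:     {saving(full_page, ppd_html):.1f}%")
print(f"#ppd Markdown vs full page: {saving(full_page, ppd_markdown):.1f}%")
print(f"#ppd Markdown vs #ppd HTML: {saving(ppd_html, ppd_markdown):.1f}%")
```

The last line is worth a look: even compared with the already trimmed #ppd HTML, the Markdown version cuts the vast majority of the remaining tokens.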


By keeping only the essential content and converting HTML to Markdown, you get a much smaller payload, lower costs, and faster processing. Big win! 💰

Markdown vs HTML: The Battle for Tokens and Cost Savings

The last step is to check that the Markdown text still contains all the key data. To do this, pass it to an LLM along with the last part of the original prompt. Here is the JSON result you'll get:

{
  "product_title": "PlayStation®5 console (slim)",
  "price": "$499.00",
  "availability": "In stock",
  "customer_ratings": {
    "rating": 4.6,
    "total_ratings": 5814
  }
}

This is exactly what your AI agent would return. Spot on!
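If you want to run this check automatically instead of eyeballing it, a minimal Python sketch could look like the one below. It only verifies that the four fields requested in the prompt are present in the LLM's answer. The names REQUIRED_FIELDS and missing_fields are hypothetical, and a real pipeline would likely validate types and values too.

```python
import json

# The four pieces of data the prompt asked for.
REQUIRED_FIELDS = ("product_title", "price", "availability", "customer_ratings")


def missing_fields(raw: str) -> list:
    """Parse the LLM's JSON answer and list any requested field it lacks."""
    data = json.loads(raw)
    return [field for field in REQUIRED_FIELDS if field not in data]


llm_output = """
{
  "product_title": "PlayStation®5 console (slim)",
  "price": "$499.00",
  "availability": "In stock",
  "customer_ratings": {
    "rating": 4.6,
    "total_ratings": 5814
  }
}
"""

print(missing_fields(llm_output) or "All requested fields are present")
```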

For a quick overview, check the final summary table below:

| Approach      | Tokens  | o1-mini price | GPT-4o-mini price | GPT-4o price |
|---------------|---------|---------------|-------------------|--------------|
| Entire HTML   | 896,871 | $13.4531      | $0.1345           | $2.2422      |
| #ppd HTML     | 309,951 | $4.6493       | $0.0465           | $0.7749      |
| #ppd Markdown | 7,943   | $0.0596       | $0.0012           | $0.0199      |
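The GPT-4o and GPT-4o-mini figures above can be reproduced with simple arithmetic. The Python sketch below assumes input prices of $2.50 and $0.15 per million tokens. These rates are not stated in this post; they are back-calculated from the figures above, so check current pricing before relying on them. The o1-mini column is left out because its Markdown row implies a different rate than its other two rows.

```python
# Assumed input prices in USD per 1M tokens (back-calculated, not official).
PRICE_PER_MILLION = {"GPT-4o-mini": 0.15, "GPT-4o": 2.50}

# Token counts reported in this post.
TOKENS = {"Entire HTML": 896_871, "#ppd HTML": 309_951, "#ppd Markdown": 7_943}


def request_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens in a single request."""
    return tokens / 1_000_000 * price_per_million


for approach, tokens in TOKENS.items():
    costs = ", ".join(
        f"{model}: ${request_cost(tokens, price):.4f}"
        for model, price in PRICE_PER_MILLION.items()
    )
    print(f"{approach:<14} {tokens:>8,} tokens  {costs}")
```

Multiply any of these per-request costs by the number of pages your agent visits per day and the gap between raw HTML and Markdown gets very real, very fast.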

Where AI Agents Fall Short

All these token-saving tricks are useless if your AI agent gets blocked by the target site 😅 (ever seen an AI agent get stuck on a CAPTCHA? 🤣).

Why does this happen? Simple! Most sites use anti-scraping measures that can easily block automated browsers. Want the full breakdown? Watch the webinar below:

If you've followed our advanced web scraping guide, you know that the problem isn't with browser automation tools (the libraries that power AI agents). No, the real culprit is the browser itself. 🤖

To avoid getting blocked, you need a browser built specifically for cloud automation. Enter Scraping Browser, a browser that:

  • Runs in headful mode just like a regular browser, which makes it harder for anti-bot systems to detect. 🔍
  • Runs in the cloud, which saves you time and money on infrastructure. 💰
  • Automatically solves CAPTCHAs, handles browser fingerprinting, customizes cookies/headers, and retries requests seamlessly. ⚡
  • Rotates IPs from one of the largest and most reliable proxy networks out there. 🌍
  • Integrates seamlessly with popular automation libraries such as Playwright, Selenium, and Puppeteer. 🔧

Learn more about Bright Data's Scraping Browser, the perfect tool to integrate into AI agents:

Final Thoughts

You're now in the loop on why AI agents use Markdown for data processing. It's a simple trick to save tokens (and money) while speeding up LLM processing.

Want your AI agent to run without hitting blocks? Take a look at Bright Data's suite of tools for AI! Join us in making the Internet accessible to everyone, even through automated AI agent browsers. 🌐

Until next time, keep surfing the web freely! 🏄
