Why Are the New AI Agents Choosing Markdown Over HTML?

AI agents are taking the world by storm and mark the next big step in the evolution of AI. So, what do all these agents have in common? They use Markdown instead of raw HTML when processing content from web pages. Curious to find out why?

This blog post will show you how this simple trick can save you up to 99% in tokens and money!

AI Agents and Data Processing: An Introduction

AI agents are software systems that harness the power of artificial intelligence to accomplish tasks and pursue goals on behalf of users. Equipped with reasoning, planning, and memory, these agents can make decisions, learn, and adapt, all on their own.

In recent months, AI agents have taken off, especially in the world of browser automation. These AI agent browsers let you use LLMs to control browsers programmatically and automate tasks such as adding products to your Amazon cart.

Have you ever wondered about the libraries and frameworks behind AI agents, such as Crawl4AI, ScrapeGraphAI, and LangChain?

When processing data from web pages, these solutions often convert HTML to Markdown automatically, or offer ways to do so, before sending the data to LLMs. But why do AI agents prefer Markdown over HTML?

The short answer is: to save tokens and speed up processing!

Time to dig deeper! But first, let's take a look at another approach AI agents use to reduce the data load.

From Data Overload to Clarity: The First Step for AI Agents

Imagine that you want your AI agent to:

  1. Connect to an e-commerce site (for example, Amazon)

  2. Search for a product (for example, PlayStation 5)

  3. Extract data from that specific product page

This is a common scenario for an AI agent, because scraping e-commerce sites is a wild ride. After all, product pages are a chaotic maze of ever-changing layouts, which makes parsing the data programmatically a nightmare. This is where AI agents show off their superpowers, leveraging LLMs to extract data smoothly, no matter how the page structure changes!

Now, let's say you're on a mission to grab all the key details from the PlayStation 5 product page on Amazon:

PlayStation 5 Amazon page

Here's how you might prompt an AI agent browser to achieve this:

Navigate to Amazon's homepage. Search for 'PlayStation 5' and select the top result. 
Extract the product title, price, availability, and customer ratings. 
Return the data in a structured JSON format.

This is what the AI agent should do (hopefully):

  1. Open Amazon in the browser

  2. Search for "PlayStation 5"

  3. Identify the correct product

  4. Extract the product details from the page and return them in JSON

But here is the real challenge: step 4. The Amazon PlayStation 5 page is a monster! Its HTML is packed with tons of information, most of which you don't even need.

Want proof? Copy the page's full HTML from the DOM in your browser and drop it into a tool like an LLM token calculator:

The result from the token calculator

Brace yourself...

896,871 tokens!

896,871 tokens?! Yes, you read that right: eight hundred ninety-six thousand, eight hundred seventy-one tokens!

That is a huge load of data, AKA a ton of money! (More than $2 per request on GPT-4o!)

As you can imagine, passing all this data to an AI agent comes with major limitations:

  1. It may require premium/enterprise plans to support such high token usage
  2. It costs a fortune, especially if you run frequent queries
  3. It slows down responses, because the AI has to process a ridiculous amount of information

The Fix: Trimming the Fat

Most AI agents let you specify a CSS selector to extract only the relevant sections of a web page. Others use heuristic algorithms to filter content automatically, such as stripping headers and footers (which usually don't add any value).

For example, if you inspect the Amazon PlayStation 5 product page, you'll notice that most of the useful content lives inside the HTML element identified by the #ppd CSS selector:

The #ppd HTML element
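
One way to grab just that section, as a minimal sketch: the helper below takes any object that exposes the standard DOM `querySelector` method (the browser's `document`, for instance) and returns the matching element's `outerHTML`. The `extractSection` name is hypothetical, and `#ppd` is simply the selector observed on this page, which may change over time.

```javascript
// Hypothetical helper: return the HTML of the first element matching a CSS selector.
// `root` is anything with a DOM-style querySelector (e.g. the browser's `document`).
function extractSection(root, selector) {
  const element = root.querySelector(selector);
  // querySelector returns null when nothing matches.
  return element ? element.outerHTML : null;
}

// In a browser console on the product page, you could run:
//   const sectionHtml = extractSection(document, "#ppd");
//   if (sectionHtml) console.log(sectionHtml.length);
```

Passing the root in as a parameter keeps the function usable both in the browser and with any DOM implementation your agent framework provides.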

Now, what if you told the AI agent to focus only on the #ppd element instead of the entire page? Would that make a difference?

Let's put it to the test in the head-to-head below!

Markdown vs HTML in AI Data Processing: A Head-to-Head Comparison

Compare the token usage when processing a portion of a web page directly as HTML versus converting it to Markdown first.

HTML

In your browser, copy the HTML of the #ppd element and drop it into a token calculator:

309,951 tokens, this time

From 896,871 tokens to only 309,951: nearly 65% saved!

That is a significant drop, for sure, but let's be real: it is still a lot of tokens!

Markdown

Now, let's replicate the trick AI agents use by taking advantage of an online HTML-to-Markdown converter. But first, remember that AI agents perform some preprocessing to remove tags that carry no meaningful content, such as <script> and <style> tags:

const scriptRegex = /
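
The snippet above is cut off, so its exact patterns are not recoverable. As a minimal sketch of this kind of preprocessing, assuming regex-based stripping of `<script>` and `<style>` blocks is good enough for a quick experiment (the `stripNonContentTags` name and both patterns are assumptions, not the original code):

```javascript
// Hypothetical helper: drop tags that carry no readable content.
// Regexes are a shortcut for experiments; they are not a full HTML parser.
function stripNonContentTags(html) {
  // Match an opening tag, everything inside it (lazily), and the closing tag.
  const scriptRegex = /<script\b[^>]*>[\s\S]*?<\/script\s*>/gi;
  const styleRegex = /<style\b[^>]*>[\s\S]*?<\/style\s*>/gi;
  return html.replace(scriptRegex, "").replace(styleRegex, "");
}

// Example: only the paragraph survives.
console.log(stripNonContentTags("<p>Hi</p><script>track()</script>"));
```

For production use, a real HTML parser is the safer choice, since regexes can be fooled by unusual markup.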

Next, copy the cleaned HTML and convert it to Markdown using an online HTML-to-Markdown converter:

HTML to Markdown

The resulting Markdown is much smaller, but it still contains all the important text data!
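
To see why the Markdown comes out so much smaller, here is a toy converter. It is only a sketch covering a handful of tags, not what the online tool or any agent library actually does, and the `htmlToMarkdown` name is hypothetical. It shows the core idea: attributes and wrapper tags vanish, while the text and a little structure stay.

```javascript
// Toy HTML-to-Markdown converter covering a handful of tags.
// Hypothetical sketch: real converters handle nesting, tables, and edge cases.
function htmlToMarkdown(html) {
  return html
    // Headings: <h1>..<h6> become "#".."######" lines.
    .replace(/<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1>/gi,
      (_, level, text) => `\n${"#".repeat(Number(level))} ${text.trim()}\n`)
    // Links: keep the text and the href, drop every other attribute.
    .replace(/<a\b[^>]*\bhref="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, "[$2]($1)")
    // Bold and italic.
    .replace(/<(strong|b)\b[^>]*>([\s\S]*?)<\/\1>/gi, "**$2**")
    .replace(/<(em|i)\b[^>]*>([\s\S]*?)<\/\1>/gi, "*$2*")
    // List items become "- " bullets.
    .replace(/<li\b[^>]*>([\s\S]*?)<\/li>/gi, (_, text) => `- ${text.trim()}\n`)
    // Paragraph ends and line breaks become newlines.
    .replace(/<\/p>|<br\s*\/?>/gi, "\n")
    // Drop whatever tags remain (div, span, ul, ...), keeping their text.
    .replace(/<[^>]+>/g, "")
    // Collapse runs of blank lines and trim the ends.
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

const sampleHtml =
  '<div class="a"><h1 id="t">PlayStation 5</h1><p>Price: <strong>$499.00</strong></p>' +
  '<ul><li>In stock</li></ul><a class="x" href="https://example.com/ps5">Details</a></div>';
console.log(htmlToMarkdown(sampleHtml));
```

The sample markup and the example.com URL are made up for illustration. On real product pages the gap is far larger, because most of the HTML is attributes, wrappers, and inline data rather than text.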

Now, paste this Markdown into the LLM token calculator:

7,943 tokens!

Boom! From 896,871 tokens down to just 7,943. That is a saving of ~99%!

By extracting only the essential content and converting HTML to Markdown, you get a much lighter payload, lower costs, and faster processing. Big win!
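
The savings quoted in this comparison can be checked with a few lines of arithmetic. This is a small sketch using the three token counts reported above; the `savingsPercent` name is just for illustration.

```javascript
// Percentage of tokens saved when going from `before` to `after`.
function savingsPercent(before, after) {
  return (1 - after / before) * 100;
}

const fullPageTokens = 896871;  // entire page HTML
const sectionTokens = 309951;   // #ppd element as HTML
const markdownTokens = 7943;    // #ppd element as Markdown

console.log(savingsPercent(fullPageTokens, sectionTokens).toFixed(1));  // "65.4"
console.log(savingsPercent(fullPageTokens, markdownTokens).toFixed(1)); // "99.1"
console.log(savingsPercent(sectionTokens, markdownTokens).toFixed(1));  // "97.4"
```

The last line shows that the Markdown conversion alone, applied to the already trimmed section, accounts for most of the reduction.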

Markdown vs HTML: The Battle for Token and Cost Savings

The last step is to verify that the Markdown text still contains all the key data. To do this, pass it to an LLM along with the last part of the original prompt. Here is the JSON result you will get:

{
  "product_title": "PlayStation®5 console (slim)",
  "price": "$499.00",
  "availability": "In stock",
  "customer_ratings": {
    "rating": 4.6,
    "total_ratings": 5814
  }
}
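
As a rough sketch of that verification step: the first helper pairs the final part of the prompt with the Markdown, and the second checks that the reply contains the expected fields. Both function names are hypothetical, and the `role`/`content` message shape is the common chat format, assumed here rather than taken from any specific client. No API call is made.

```javascript
// Hypothetical helper: pair the extraction instructions with the Markdown content.
// The role/content message shape is the common chat format; adapt it to your client.
function buildExtractionMessages(markdown) {
  const instructions =
    "Extract the product title, price, availability, and customer ratings. " +
    "Return the data in a structured JSON format.";
  return [
    { role: "system", content: instructions },
    { role: "user", content: markdown },
  ];
}

// Parse the model's reply and check that the expected keys are present.
function parseProduct(reply) {
  const data = JSON.parse(reply);
  const required = ["product_title", "price", "availability", "customer_ratings"];
  const missing = required.filter((key) => !(key in data));
  if (missing.length > 0) {
    throw new Error(`Missing fields: ${missing.join(", ")}`);
  }
  return data;
}
```

Validating the keys up front makes it obvious when the trimmed Markdown has lost a piece of data the prompt asked for.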

This is exactly what your AI agent would return. Spot on!

For a quick overview, check out the final summary table below:

| Method        | Tokens  | o1-mini Price | GPT-4o-mini Price | GPT-4o Price |
|---------------|---------|---------------|-------------------|--------------|
| Entire HTML   | 896,871 | $13.4531      | $0.1345           | $2.2422      |
| #ppd HTML     | 309,951 | $4.6493       | $0.0465           | $0.7749      |
| #ppd Markdown | 7,943   | $0.0596       | $0.0012           | $0.0199      |
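
The GPT-4o and GPT-4o-mini figures can be reproduced with simple arithmetic. In this sketch, the per-million-token input prices are assumptions inferred from the figures themselves ($2.50 for GPT-4o and $0.15 for GPT-4o-mini), so check current pricing before relying on them.

```javascript
// Assumed input prices in USD per 1M tokens, inferred from the summary figures.
const PRICE_PER_MILLION = {
  "gpt-4o-mini": 0.15,
  "gpt-4o": 2.5,
};

// Cost in USD of sending `tokens` input tokens to `model`.
function inputCost(tokens, model) {
  return (tokens / 1_000_000) * PRICE_PER_MILLION[model];
}

console.log(inputCost(896871, "gpt-4o").toFixed(4));    // "2.2422"
console.log(inputCost(309951, "gpt-4o").toFixed(4));    // "0.7749"
console.log(inputCost(7943, "gpt-4o").toFixed(4));      // "0.0199"
console.log(inputCost(7943, "gpt-4o-mini").toFixed(4)); // "0.0012"
```

These are per-request input costs only. Output tokens and repeated agent steps add to the bill, which is why the gap matters even more for agents that visit many pages.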

Where AI Agents Fall Short

All of these token-saving tricks are useless if your AI agent gets blocked by the target site (ever seen an AI agent struggle with a CAPTCHA?).

Why does this happen? Simple! Most sites use anti-scraping measures that can easily block automated browsers. Want the full breakdown? Watch the webinar below:

If you have followed our advanced web scraping guide, you know that the problem is not the browser automation tools (the libraries that power AI agents). No, the real culprit is the browser itself.

To avoid getting blocked, you need a browser built specifically for cloud automation. Enter Scraping Browser, a browser that:

  • Runs in headed mode just like a regular browser, which makes it harder for anti-bot systems to detect.
  • Runs in the cloud, which saves you time and money on infrastructure.
  • Solves CAPTCHAs automatically, handles browser fingerprinting, customizes cookies/headers, and manages retries so things keep running smoothly.
  • Rotates IPs from one of the largest and most reliable proxy networks out there.
  • Integrates seamlessly with popular automation libraries such as Playwright, Selenium, and Puppeteer.

Learn more about Bright Data's Scraping Browser, the perfect tool to integrate into AI agents:

Final Thoughts

You are now in the loop on why AI agents use Markdown for data processing. It is a simple trick to save tokens (and money) while speeding up LLM processing.

Want your AI agent to run without hitting blocks? Take a look at Bright Data's suite of tools for AI! Join us in making the Internet accessible to everyone, even through automated AI agent browsers.

Until next time, keep surfing the web with freedom!
