@@ -23,6 +23,7 @@ properly extract data for some websites.
2323 with today's websites, which rely on a lot of page interactions to display
2424 their contents.
2525
26+ .. _`httprequest-example`:
2627
2728HttpRequest
2829===========
@@ -263,21 +264,67 @@ additional requests asynchronously using ``asyncio.gather()``, ``asyncio.wait()`
263264etc. This means that ``asyncio`` could be used anywhere inside the Page Object,
264265including the ``to_item()`` method.
265266
266- In the previous section, we've explored how :class:`~.HttpRequest` are defined.
267- Fortunately, the :meth:`~.HttpClient.request`, :meth:`~.HttpClient.get`, and
268- :meth:`~.HttpClient.post` methods of :class:`~.HttpClient` already defines the
269- :class:`~.HttpRequest` and executes it as well. The only time you'll need to create
270- :class:`~.HttpRequest` manually is via the :meth:`~.HttpClient.batch_requests`
271- method which is described in this section: :ref:`http-batch-request-example`.
272-
267+ In the previous section, we've explored how :class:`~.HttpRequest` is defined.
273268Let's look at a few quick examples of how to execute additional requests using
274269the :class:`~.HttpClient`.
275270
271+ Executing an HttpRequest instance
272+ ---------------------------------
273+
274+ .. code-block:: python
275+
276+     import attrs
277+     import web_poet
278+
279+
280+     @attrs.define
281+     class ProductPage(web_poet.ItemWebPage):
282+         http_client: web_poet.HttpClient
283+
284+         async def to_item(self):
285+             item = {
286+                 "url": self.url,
287+                 "name": self.css("#main h3.name ::text").get(),
288+                 "product_id": self.css("#product ::attr(product-id)").get(),
289+             }
290+
291+             # Simulates clicking on a button that says "View All Images"
292+             request = web_poet.HttpRequest(f"https://api.example.com/v2/images?id={item['product_id']}")
293+             response: web_poet.HttpResponse = await self.http_client.execute(request)
294+
295+             item["images"] = response.css(".product-images img::attr(src)").getall()
296+             return item
297+
298+ As the example suggests, we're performing an additional request to extract
299+ more images from a product page than would otherwise be available. This is
300+ because an additional button needs to be clicked, which fetches the complete
301+ set of product images via AJAX.
302+
303+ There are a few things to take note of in this example:
304+
305+     * Recall from the :ref:`httprequest-example` tutorial section that the
306+       default method is ``GET``.
307+     * We're now using the ``async/await`` syntax inside the ``to_item()`` method.
308+     * The response from the additional request is of type :class:`~.HttpResponse`.
309+
310+ .. tip::
311+
312+     See the :ref:`http-batch-request-example` tutorial section to learn how to
313+     execute a group of :class:`~.HttpRequest` instances in batch.
314+
315+ Fortunately, there are already some shortcuts for performing single
316+ additional requests: the :meth:`~.HttpClient.request`, :meth:`~.HttpClient.get`,
317+ and :meth:`~.HttpClient.post` methods of :class:`~.HttpClient`. These already
318+ define the :class:`~.HttpRequest` and execute it as well.
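
To make the relationship between the shortcut methods and an explicit request
object concrete, here is a minimal, self-contained sketch. ``ToyRequest`` and
``ToyClient`` are hypothetical stand-ins written only for illustration; they
perform no network I/O and are not web-poet's actual implementation. The sketch
only assumes that a shortcut builds the request object and then hands it to
``execute()``.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToyRequest:
    """Hypothetical stand-in for an HTTP request description."""

    url: str
    method: str = "GET"
    body: bytes = b""


class ToyClient:
    """Hypothetical stand-in client; it performs no network I/O."""

    async def execute(self, request: ToyRequest) -> str:
        # A real client would download the page here.
        await asyncio.sleep(0)
        return f"{request.method} {request.url}"

    async def request(self, url: str, *, method: str = "GET", body: bytes = b"") -> str:
        # Shortcut: build the request object, then execute it.
        return await self.execute(ToyRequest(url, method=method, body=body))

    async def get(self, url: str) -> str:
        return await self.request(url, method="GET")

    async def post(self, url: str, *, body: bytes = b"") -> str:
        return await self.request(url, method="POST", body=body)


async def demo() -> tuple:
    client = ToyClient()
    url = "https://api.example.com/v2/images?id=1"
    explicit = await client.execute(ToyRequest(url))  # explicit request object
    shortcut = await client.get(url)  # same request via the shortcut
    return explicit, shortcut
```

In this sketch both calls describe the same ``GET`` request, so the explicit
form and the shortcut produce the same result.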
319+
276320.. _`httpclient-get-example`:
277321
278322A simple ``GET`` request
279323------------------------
280324
325+ Let's take the example from the previous section and use the :meth:`~.HttpClient.get`
326+ method on it instead.
327+
281328.. code-block:: python
282329
283330    import attrs
@@ -306,13 +353,8 @@ There are a few things to take note in this example:
306353
307354    * A ``GET`` request can be done via :class:`~.HttpClient`'s
308355      :meth:`~.HttpClient.get` method.
309-     * We're now using the ``async/await`` syntax inside the ``to_item()`` method.
310-     * The response from the additional request is of type :class:`~.HttpResponse`.
311-
312- As the example suggests, we're performing an additional request that allows us
313- to extract more images in a product page that might not otherwise be possible.
314- This is because in order to do so, an additional button needs to be clicked
315- which fetches the complete set of product images via AJAX.
356+     * There was no need to instantiate a :class:`~.HttpRequest` since :meth:`~.HttpClient.get`
357+       already handles it before executing the request.
316358
317359.. _`request-post-example`:
318360
@@ -378,16 +420,17 @@ Batch requests
378420--------------
379421
380423We can also choose to process requests by **batch** instead of sequentially or
381- one by one. The :meth:`~.HttpClient.batch_requests` method can be used for this
382- which accepts an arbitrary number of :class:`~.HttpRequest` instances.
423+ one by one (e.g. using :meth:`~.HttpClient.execute`). The :meth:`~.HttpClient.batch_execute`
424+ method, which accepts an arbitrary number of :class:`~.HttpRequest` instances,
425+ can be used for this.
383426
384427Let's modify the example in the previous section to see how it can be done.
385428
386429The difference between this code example and the previous section is that we're
387430increasing the pagination from only the **2nd page** up to the **10th page**.
388431Instead of calling a single :meth:`~.HttpClient.post` method, we're creating a
389432list of :class:`~.HttpRequest` to be executed in batch using the
390- :meth:`~.HttpClient.batch_requests` method.
433+ :meth:`~.HttpClient.batch_execute` method.
391434
392435.. code-block:: python
393436
@@ -415,7 +458,7 @@ list of :class:`~.HttpRequest` to be executed in batch using the
415458                self.create_request(item["product_id"], page_num=page_num)
416459                for page_num in range(2, self.default_pagination_limit)
417460            ]
418-             responses: List[web_poet.HttpResponse] = await self.http_client.batch_requests(*requests)
461+             responses: List[web_poet.HttpResponse] = await self.http_client.batch_execute(*requests)
419462            related_product_ids = [
420463                id_
421464                for response in responses
@@ -452,12 +495,12 @@ The key takeaways for this example are:
452495      It only contains the HTTP Request information for now and isn't executed yet.
453496      This is useful for creating factory methods to help create requests without any
454497      download execution at all.
455-     * :class:`~.HttpClient` has a :meth:`~.HttpClient.batch_requests` method that
498+     * :class:`~.HttpClient` has a :meth:`~.HttpClient.batch_execute` method that
456499      can process a list of :class:`~.HttpRequest` instances asynchronously together.
457500
458501.. tip::
459502
460-     The :meth:`~.HttpClient.batch_requests` method can accept different varieties
503+     The :meth:`~.HttpClient.batch_execute` method can accept different varieties
461504    of :class:`~.HttpRequest` that might not be related with one another. For
462505    example, it could be a mixture of ``GET`` and ``POST`` requests or even
463506    representing requests for various parts of the page altogether.
@@ -466,7 +509,7 @@ The key takeaways for this example are:
466509    of async execution which could be faster in certain cases `(assuming you're
467510    allowed to perform HTTP requests in parallel)`.
468511
469-     Nonetheless, you can still use the :meth:`~.HttpClient.batch_requests` method
512+     Nonetheless, you can still use the :meth:`~.HttpClient.batch_execute` method
470513    to execute a single :class:`~.HttpRequest` instance.
471514
472515
@@ -566,7 +609,7 @@ For this example, let's improve the code snippet from the previous subsection na
566609 ]
567610
568611            try:
569-                 responses: List[web_poet.HttpResponse] = await self.http_client.batch_requests(*requests)
612+                 responses: List[web_poet.HttpResponse] = await self.http_client.batch_execute(*requests)
570613            except web_poet.exceptions.HttpRequestError:
571614                logger.warning(
572615                    f"Unable to request for more related products for product ID: {item['product_id']}"
@@ -605,17 +648,17 @@ For this example, let's improve the code snippet from the previous subsection na
605648        def parse_related_product_ids(response_page) -> List[str]:
606649            return response_page.css("#main .related-products ::attr(product-id)").getall()
607650
608- Handling exceptions using :meth:`~.HttpClient.batch_requests` remains largely the same.
651+ Handling exceptions using :meth:`~.HttpClient.batch_execute` remains largely the same.
609652However, the main difference is that you might be wasting perfectly good responses just
610653because a single request from the batch ruined it.
611654
612655An alternative approach would be to salvage the good responses. For example, say you've
613656sent out 10 :class:`~.HttpRequest` and only 1 of them had an exception during processing.
614657You can still get the data from 9 of the :class:`~.HttpResponse` by passing the parameter
615- ``return_exceptions=True`` to :meth:`~.HttpClient.batch_requests`.
658+ ``return_exceptions=True`` to :meth:`~.HttpClient.batch_execute`.
616659
617660This means that any exceptions raised during request execution are returned alongside any
618- of the successful responses. The return type of :meth:`~.HttpClient.batch_requests` could
661+ of the successful responses. The return type of :meth:`~.HttpClient.batch_execute` could
619662be a mixture of :class:`~.HttpResponse` and :class:`web_poet.exceptions.http.HttpRequestError`.
621664Here's an example using :meth:`~.HttpClient.batch_execute`:
@@ -625,7 +668,7 @@ Here's an example:
625668    # Revised code snippet from the to_item() method
626669
627670    responses: List[Union[web_poet.HttpResponse, web_poet.exceptions.HttpRequestError]] = (
628-         await self.http_client.batch_requests(*requests, return_exceptions=True)
671+         await self.http_client.batch_execute(*requests, return_exceptions=True)
629672    )
630673
631674    related_product_ids = []