update_prices.ipynb¶
This notebook guides through the action of updating the prices of a database, i.e. adding records in the price table with the prices registered on the current day.
The notebook is divided in two parts:
In the first part we create a minimal database with a single book. We save it as
database.dbin the current folder.In the second part we load the database
database.dband we read it as apolars.DataFrame. We then read all the rows of the database and scrape the prices for any link we find. The prices are then added to the tablepricesin the database with the current date.
from datetime import datetime
import bookhoarder as bh
%opts magic unavailable (pyparsing cannot be imported)
%compositor magic unavailable (pyparsing cannot be imported)
We load the database database.db in the current folder, which is read as a polars.DataFrame with a specific schema (specified by mbp.schema())
books = bh.read_database(
table_name="books",
url="sqlite:///database.db",
schema=bh.schema(),
)
books
| isbn | author | title | year | publisher | language | format | full_price | min_price | adelphi | buecher | feltrinelli | hoepli | hugendubel | ibs | libcoop | libraccio | libuni | mondadori | osiander | rizzoli |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | str | str | str | str | str | str | f64 | f64 | str | str | str | str | str | str | str | str | str | str | str | str |
| "9783866473256" | "Karl Marx" | "Das Kapital" | "2009" | "Anaconda" | null | null | 7.95 | 7.95 | null | "https://www.buecher.de/artikel… | null | null | null | null | null | null | null | null | "https://www.osiander.de/shop/h… | null |
| "9781857988826" | "Ursula K. Le Guin" | "The Dispossessed" | "1999" | "Orion Publishing Co" | null | null | 13.0 | 13.0 | null | "https://www.buecher.de/artikel… | "https://www.lafeltrinelli.it/d… | null | null | "https://www.ibs.it/dispossesse… | null | null | null | null | "https://www.osiander.de/shop/h… | null |
As a first step, we define empty lists, we store today’s date, and the book’s isbn.
list_isbns = []
list_sites = []
list_prices = []
list_dates = []
date = datetime.today().date()
Notice that we need to keep in mind that for each book we do not have links for all webistes, for example we only have the following websites for the first book:
[s.name for s in books[0][bh.list_sites()] if not s.null_count()]
['buecher', 'osiander']
Then, the following function extracts the following lists:
|
|
|---|---|
isbn_1 |
link_1 for book_1 |
isbn_1 |
link_2 for book_1 |
isbn_1 |
link_3 for book_1 |
isbn_2 |
link_1 for book_2 |
isbn_2 |
link_2 for book_2 |
etc.
prices = bh.scrape_database(books)
prices
100%|██████████| 6/6 [00:04<00:00, 1.22it/s]
| isbn | site | price | date |
|---|---|---|---|
| str | str | f64 | date |
| "9783866473256" | "buecher" | 7.95 | 2025-06-12 |
| "9783866473256" | "osiander" | 7.95 | 2025-06-12 |
| "9781857988826" | "buecher" | 9.99 | 2025-06-12 |
| "9781857988826" | "feltrinelli" | 13.74 | 2025-06-12 |
| "9781857988826" | "ibs" | 13.05 | 2025-06-12 |
| "9781857988826" | "osiander" | 10.99 | 2025-06-12 |
We create an updated_database.db to store the updated values.
bh.write_database(
books,
table_name="books",
url="sqlite:///updated_database.db",
if_table_exists="replace",
)
bh.write_database(
prices,
table_name="prices",
url="sqlite:///updated_database.db",
if_table_exists="replace",
)
After scraping the prices, we update the minimum (min_price) in the books table.
bh.update_min_whole_database(url="sqlite:///updated_database.db")
bh.read_database(
table_name="books",
url="sqlite:///updated_database.db",
schema=bh.schema(),
)
| isbn | author | title | year | publisher | language | format | full_price | min_price | adelphi | buecher | feltrinelli | hoepli | hugendubel | ibs | libcoop | libraccio | libuni | mondadori | osiander | rizzoli |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | str | str | str | str | str | str | f64 | f64 | str | str | str | str | str | str | str | str | str | str | str | str |
| "9783866473256" | "Karl Marx" | "Das Kapital" | "2009" | "Anaconda" | null | null | 7.95 | 7.95 | null | "https://www.buecher.de/artikel… | null | null | null | null | null | null | null | null | "https://www.osiander.de/shop/h… | null |
| "9781857988826" | "Ursula K. Le Guin" | "The Dispossessed" | "1999" | "Orion Publishing Co" | null | null | 13.0 | 9.99 | null | "https://www.buecher.de/artikel… | "https://www.lafeltrinelli.it/d… | null | null | "https://www.ibs.it/dispossesse… | null | null | null | null | "https://www.osiander.de/shop/h… | null |