A few thingz


Joseph Basquin


25/04/2025

In search of an oak parquet finish (16 years later...)

When I lived in Nancy 16 years ago, while visiting various apartments for rent (by the way: the architecture of early-20th-century apartments in Nancy is incredibly beautiful), I saw this solid oak parquet floor:

I have never stopped thinking about it since, and I have looked for it in my successive apartments. Now renovating a house, where an oak parquet floor will soon be sanded, I will naturally try to reproduce this effect.

But many obstacles arise:

Moreover, Google searches on the subject lead to piles of products from online shops, with enchanting names like "Vitrificateur patine ancienne effet vieilli" (roughly: "aged-effect antique-patina floor sealer"), but in the end it is impossible to test everything (you would have to buy dozens of products), or to compare them with each other, since the composition of the products is not disclosed (in software terms: not open source!).

Besides, YouTube tutorials show tricks for tinting wood with tea bags or a vinegar + steel wool mixture. Fair enough. It may work. But realistically, was that used at scale in the 1910s, when parquet was being laid over hundreds of square meters of apartments? With dozens of kilograms of tea, or liters of vinegar + steel wool mixture? I doubt it. (If you have any information, I'm interested.)

Let's recap: this parquet I loved so much in that old apartment was probably tinted in the early 20th century, so we might as well look among products available at the time. Out, then, with modern sealers and the like, which narrows the choice. But what precise recipe did they use back then? Idea: why not use Gallica, the digitized archive of the Bibliothèque Nationale de France?

And I ended up finding this in La Revue de l'habitation, Ma petite maison (Paris), 1908:

Also here, in Au Bon Marché, Catalogue. Ménage, orfèvrerie, services de table, articles de jardin, outillage, entretien, 1926:

Also, in Nouvelle encyclopédie pratique du bâtiment et de l'habitation, Volume 9, by René Champly:

Excerpt 1: CHAPTER XII — ENCAUSTIC POLISHES, WALNUT STAIN (BROU DE NOIX). Encaustics are solutions of wax applied to parquet floors or woodwork meant to be waxed (...) Once the encaustic is dry, simply rub with a wool cloth or a floor-polishing brush to obtain a lustrous surface (...) encaustic must only be applied to clean, dry surfaces (...)

And also in Technologie de l'employé d'hôtel, by A. Fabre and E. Guiard, 1921-1926:

See also L'Entrepreneur de peinture en bâtiment by F. Nimbeau, a treatise covering painting, glazing, gilding, mirrors, framing, and wall coverings, 1894.

There's the solution!

After a few tests (photos to come someday) of the respective proportions of the ingredients, I finally managed to get the result I was hoping for.

PROBLEM SOLVED, 16 YEARS LATER!

Here is the rule I have just formalized after 40 years (though I have been applying it unconsciously for a long time):

“When you want a precise result, with particular attention to detail, there is no point looking for an off-the-shelf product to do it: that will only lead to disappointment at not getting exactly the finish you were after. Instead, simply make the product yourself!”
(and beforehand: look for the simplest possible base ingredients, then experiment)


About me: I am Joseph Basquin, maths PhD. I create products such as SamplerBox, YellowNoiseAudio, Jeux d'orgues, this blogging engine...
I do freelancing: Software product design / Python / R&D / Automation / Embedded / Audio / Data / UX / MVP. Send me an email.

The Content Overflow Era – the end of the Long Tail?

What follows might be trivial by now, but it's always worth putting into words. I'm speaking about media content in general: books, music, website articles, soon videos, and so on.

Here is what the "Long Tail" is now evolving into (see Period #2 below if you're unfamiliar with the concept):


Period #1 – Pre-Internet era

Limited published content, for at least these reasons:

 

Period #2 – The Long Tail 2000-2022

The Long Tail concept was popularized by Chris Anderson (2004, 2006). Notable aspects:

Consequence: during this period, a human producing original content (landing in the long tail) could exist as a creator and get their content read or listened to by other humans. This also made niche products economically viable for (some) creators.

 

Period #3 – Content Overflow 2022-?

Consequence for small creators: humans creating content, but who are not among the top celebrities, will have increasing difficulty getting their content read or listened to by other humans, because they will sit in the same too-long tail as AI-generated content.

 

Possible outcomes

 



The best browser bookmarking system is already built-in

(Thanks for getting this on front page of HackerNews!)

 

Over the years I have tested various browser bookmarking systems.

And about 10 years ago, I realised that the best bookmarking system is already built into most browsers (Firefox, Chrome), and that is:

files!

See the video: the drag-and-drop creates a .url shortcut file:

Just that. You drag and drop the URL onto the Desktop or into any folder.

No browser extension needed.

That's it! Probably the best bookmark manager ever.

This has worked for decades with Firefox, Chrome, Internet Explorer, and probably others, at least on Windows.
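These .url shortcuts are just small INI-style text files, so they can even be created or read outside the browser. A minimal sketch in Python (the file name and URL here are arbitrary examples):

```python
# Write a minimal .url shortcut file (the same INI-style format the
# browser drag-and-drop produces), then read the URL back from it.
from pathlib import Path

def write_url_shortcut(path, url):
    # A .url file is plain text: an [InternetShortcut] section with a URL= key
    Path(path).write_text(f"[InternetShortcut]\nURL={url}\n", encoding="utf-8")

def read_url_shortcut(path):
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.startswith("URL="):
            return line[len("URL="):]

write_url_shortcut("example.url", "https://example.com")
print(read_url_shortcut("example.url"))  # https://example.com
```

Double-clicking such a file opens it in the default browser, which is why no extension or third-party manager is needed.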

Remarks:

 



The definitive guide to Ableton Live's delay compensation and reduced latency

(A small "Note to self" post, while producing music)

Two options are often misunderstood in Ableton Live: "Delay Compensation" and "Reduced Latency When Monitoring". Even after watching many tutorials, blog articles, and videos about them, they might still be unclear.

The best way to clearly understand what is going on is to do this small experiment (do it, it takes 2 minutes and you'll understand once and for all!):

Options > Delay Compensation

Options > Reduced Latency When Monitoring

"Keep Latency" buttons

This new Ableton Live 12 option seems pretty interesting; I haven't tested it yet. To be written.

Buffer Size, Input Latency, Output Latency, Driver Error Compensation, Overall Latency

This is documented everywhere, so I won't re-explain it in detail. For Driver Error Compensation and Overall Latency, see the end of the next paragraph.
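As a rough reference, the buffer-related part of the latency is simply the buffer size divided by the sample rate. A back-of-the-envelope sketch (illustrative only: the Overall Latency reported by Live also includes the driver's input/output latencies and any Driver Error Compensation):

```python
# Back-of-the-envelope audio latency from buffer size and sample rate.
# This is only the buffer contribution, one way; the real overall figure
# adds the driver's own input/output latencies.
def buffer_latency_ms(buffer_size_samples, sample_rate_hz):
    return 1000 * buffer_size_samples / sample_rate_hz

print(buffer_latency_ms(512, 44100))  # ~11.6 ms
print(buffer_latency_ms(128, 48000))  # ~2.7 ms
```

This makes it obvious why dropping the buffer size is the first thing to try when monitoring feels laggy, at the cost of higher CPU load.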

Note on monitoring IN / AUTO / OFF setting

 

Why do I always have cables everywhere?

Working on PDF files with Python

There are many solutions to work on PDF files with Python. Depending on whether you need to read, parse data, extract tables, modify (split, merge, crop...), or create a new PDF, you will need different tools.

Here is a quick diagram of some common tools I have used:

If you need to extract data from image PDF files, it's a whole different story, and you might need to use OCR libraries like (Py)Tesseract or other tools.

Have some specific data conversion / extraction needs? Please contact me for consulting - a little script can probably automate hours of manual processing in a few seconds!

N-dimensional array data store (with labeled indexing)

What am I trying to do?

I'm currently looking for the perfect data structure for an ongoing R&D task.

I need to work with a data store as an n-dimensional array x (of dimension 4 or more) such that:

Possible solutions

I'm looking for a good and lightweight solution.
To keep things simple, I deliberately avoid (for now):

method                           | ragged | non-consecutive indexing | numpy arithm. | random access for 100 GB data store | notes
xarray                           | ?      | no                       |               |                                     |
sparse                           | ?      | no                       |               |                                     |
Pandas DataFrame + Numpy ndarray | ?      | ?                        |               |                                     | (*) (**)
Tensorflow tf.ragged.constant    | ?      | ?                        | ?             |                                     |
Sqlite + Numpy ndarray           | ?      | ?                        | ?             | ?                                   | to be tested

(*) serialization with parquet: doesn't accept 2D or 3D arrays:

import numpy as np, pandas as pd
x = pd.DataFrame(columns=['a', 'b'])
for i in range(100):
    x.loc['t%i' % i] = [np.random.rand(100, 100), np.random.rand(2000)]
x.to_parquet('test.parquet')
# pyarrow.lib.ArrowInvalid: ('Can only convert 1-dimensional array values', 'Conversion failed for column a with type object')

(**) serialization with hdf5: currently not working:

import numpy as np, pandas as pd
store = pd.HDFStore("store.h5")
df = pd.DataFrame(columns=['a', 'b'])
df.loc['t1'] = {'a': np.random.rand(100, 100), 'b': np.random.rand(2000)}
store.append('test', df)
store.close()
# TypeError: Cannot serialize the column [a] because its data contents are not [string] but [mixed] object dtype
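For the record, here is a sketch of one lightweight partial workaround: it handles labeled, ragged arrays just fine, but it loads everything into memory, so it does not satisfy the 100 GB random-access requirement. Each labeled array is stored as a named entry in a NumPy .npz archive:

```python
import numpy as np

# Each "row label + column" pair maps to an array; shapes may differ
# freely (ragged), since every entry is an independent dataset.
data = {
    "t1_a": np.random.rand(100, 100),
    "t1_b": np.random.rand(2000),
    "t2_a": np.random.rand(50, 80),   # different shape: ragged is fine here
    "t2_b": np.random.rand(300),
}
np.savez("store.npz", **data)

loaded = np.load("store.npz")
print(loaded["t1_a"].shape)  # (100, 100)
print(loaded["t2_a"].shape)  # (50, 80)
```

np.load on an .npz is lazy per entry (each array is decompressed on access), but modifying one entry still means rewriting the whole archive, which is exactly the limitation mentioned in the links below.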

Contact me if you have ideas!

Links

Data structure for n-dimensional array / tensor such A[0, :, :] and A[1, :, :] can have different shapes
Pandas rows containing numpy ndarrays various shapes
Pandas Dataframe containing Numpy ndarray and mean
100GB data store: Pandas dataframe of numpy ndarrays: loading only a small part + avoid rewriting the whole file when doing small modifications

Python + TensorFlow + GPU + CUDA + CUDNN setup with Ubuntu

Every time I set up Python + TensorFlow on a new machine with a fresh Ubuntu install, I have to spend time on this topic again and again, with some trial and error (yes, I'm speaking about such issues). So here is a little HOWTO, once and for all.

Important fact: you need to install the specific versions of CUDA and cuDNN required by your particular version of TensorFlow, otherwise it will fail with errors like libcudnn.so.7: cannot open shared object file: No such file or directory.

For example, for TensorFlow 2.3, we have to use CUDA 10.1 and cuDNN 7.6 (see here).

Here is how to install everything on Ubuntu 18.04:

pip3 install --upgrade pip   # it was mandatory to upgrade for me
pip3 install keras tensorflow==2.3.0

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt install cuda-10-1 nvidia-driver-430

To test if the NVIDIA driver is properly installed, you can run nvidia-smi (I noticed a reboot was necessary).

Then download "cuDNN v7.6.5 (November 5th, 2019), for CUDA 10.1" from https://developer.nvidia.com/rdp/cudnn-archive (you need to create an account there), and then:

sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb     

That's it! Reboot the computer, launch Python 3 and do:

import tensorflow
tensorflow.test.gpu_device_name()     # also, tensorflow.test.is_gpu_available() should give True

The last line should display the right GPU device name. If you get an empty string instead, it means your GPU isn't used by TensorFlow!

Notes:

Quick tip: rebooting a Livebox with a Python script

A handy little trick to reboot a Livebox Play in 4 lines of code:

import requests
r = requests.post("http://192.168.1.1/authenticate?username=admin&password=LEMOTDEPASSEICI")
h = {'Content-Type': 'application/json; charset=UTF-8', 'X-Context': r.json()['data']['contextID']}
s = requests.post("http://192.168.1.1/sysbus/NMC:reboot", headers=h, cookies=r.cookies)

With a Livebox 4 or 5, here is the method:

import requests
session = requests.Session()
auth = '{"service":"sah.Device.Information","method":"createContext","parameters":{"applicationName":"so_sdkut","username":"admin","password":"LEMOTDEPASSEICI"}}'
r = session.post('http://192.168.1.1/ws', data=auth, headers={'Content-Type': 'application/x-sah-ws-1-call+json', 'Authorization': 'X-Sah-Login'})
h = {'X-Context': r.json()['data']['contextID'], 'X-Prototype-Version': '1.7', 'Content-Type': 'application/x-sah-ws-1-call+json; charset=UTF-8', 'Accept': 'text/javascript'}
s = session.post("http://192.168.1.1/sysbus/NMC:reboot", headers=h, data='{"parameters":{}}')
print(s.json())

Inspired by this post using curl, by this project (the same thing in ... 99 lines of code ;)) and by the sysbus library.

NB: this reboot method changes the Livebox's IP address on restart.

"Since"

A song I made a few months ago.

Join/Leave · Since

nFreezer, a secure remote backup tool

So you make backups of your sensitive data on a remote server. How can you be sure it is really safe on the destination server?

By safe, I mean "safe even if a malicious user gains access" on the destination server; here we're looking for a solution such that, even if a hacker attacks your server (and installs compromised software on it), they cannot read your data.

You might think that using SFTP/SSH (and/or rsync, or other sync programs) together with an encrypted filesystem on the server is enough. In fact, no: there is a short window during which the data is processed unencrypted on the remote server (after leaving the SSH layer, and before reaching the filesystem encryption layer).

How to solve this problem? By using an encrypted-at-rest backup program: the data is encrypted locally, and is never decrypted on the remote server.

I created nFreezer for this purpose.

Main features:

More about this on nFreezer.




By the way, I just published another (local) backup tool on PyPI: backupdisk, which you can install with pip install diskbackup. It allows you to quickly back up your disk to an external USB HDD in one line:

diskbackup.backup(src=r'D:\Documents', dest=r'I:\Documents', exclude=['.mp4'])




Update: many thanks to @Korben for his article nFreezer – De la sauvegarde chiffrée de bout en bout (December 12, 2020).
