Thank you for replies (I acknowledge "use case" is quite meaningless...)
Below is the profile output with the top 20 operations that take the most time. If I understand well the functions related to openpyxl.descriptors.serialisable, openpyxl.descriptors.base, and the underlying xml.etree.ElementTree are to blame.
| **Count** | **Time (s)** | **CPU Time (s)** | **Wall Time (s)** | **Location** |
|--------------------|--------------|------------------|-------------------|------------------------------------------------------------------------------|
| 3444787/37649 | 42.943 | 0.000 | 140.063 | C:\Python311\Lib\site-packages\openpyxl\descriptors\serialisable.py:46(from_tree) |
| 21477798 | 23.222 | 0.000 | 33.982 | C:\Python311\Lib\site-packages\openpyxl\descriptors\base.py:39(__set__) |
| 5231626 | 11.335 | 0.000 | 21.015 | C:\Python311\Lib\xml\etree\ElementTree.py:1637(_start) |
| 9443777/9443749 | 10.466 | 0.000 | 28.238 | C:\Python311\Lib\site-packages\openpyxl\descriptors\base.py:68(__set__) |
| 57816954/57816952 | 9.624 | 0.000 | 9.624 | {built-in method builtins.isinstance} |
| 8625 | 8.404 | 0.001 | 39.484 | {method 'Parse' of 'pyexpat.xmlparser' objects} |
| 4846463 | 8.148 | 0.000 | 14.491 | C:\Python311\Lib\site-packages\openpyxl\xml\functions.py:77(localname) |
| 23870387 | 7.710 | 0.000 | 7.710 | C:\Python311\Lib\site-packages\openpyxl\descriptors\base.py:24(__set__) |
| 2528893 | 7.338 | 0.000 | 15.009 | C:\Python311\Lib\site-packages\openpyxl\worksheet\_reader.py:189(parse_cell) |
| 5231626 | 6.569 | 0.000 | 6.569 | {method 'start' of 'xml.etree.ElementTree.TreeBuilder' objects} |
| 90 | 5.383 | 0.060 | 81.916 | C:\Python311\Lib\site-packages\openpyxl\worksheet\_reader.py:367(bind_cells) |
| 5065495 | 5.338 | 0.000 | 20.358 | C:\Python311\Lib\site-packages\openpyxl\descriptors\base.py:164(__set__) |
| 2528893 | 4.229 | 0.000 | 4.608 | C:\Python311\Lib\site-packages\openpyxl\utils\cell.py:193(coordinate_to_tuple)|
| 11495787/11463855 | 4.204 | 0.000 | 4.314 | {built-in method builtins.getattr} |
| 4865685 | 4.180 | 0.000 | 4.180 | {method 'match' of 're.Pattern' objects} |
| 16243267 | 3.974 | 0.000 | 3.974 | C:\Python311\Lib\xml\etree\ElementTree.py:1620(_fixname) |
| 5240342 | 3.767 | 0.000 | 5.163 | C:\Python311\Lib\xml\etree\ElementTree.py:1310(read_events) |
| 5231626 | 3.762 | 0.000 | 5.785 | C:\Python311\Lib\xml\etree\ElementTree.py:1649(_end) |
| 5231626 | 3.577 | 0.000 | 10.064 | C:\Python311\Lib\xml\etree\ElementTree.py:1578(handler) |
| 5529298/5528595 | 3.324 | 0.000 | 4.194 | C:\Python311\Lib\site-packages\openpyxl\descriptors\base.py:53(_convert) |
I couldn't find alternatives to openpyxl (the extension xlsm of my Excel and the for free nature narrow down options dramatically).
Maybe to be more specific about what I'm willing to achieve. I have an Excel range, and the cells in that range can have one of 4 possible background colors (as seen by the user on his/her screen). And I'm interested in returning the column/row of cells with lightred background. In more technical terms these lightred cells are distinguishable from the 3 other background colors by either (or both) theme and schema as determined as follows
# Load the workbook and select the sheet
workbook = openpyxl.load_workbook(filename)
sheet = workbook['Sheet1']
# Get the cell
cell = sheet['P95']
# Get the fill pattern of the cell
fill = cell.fill
# Get the cell's theme and schema (if applicable)
theme = cell.fill.start_color.theme if cell.fill.start_color.theme else None # 5
schema = cell.fill.start_color.tint if cell.fill.start_color.tint else None # 0.7999511703848384
No other background color has a theme == 5 and/or a schema == 0.7999511703848384
And I'm looking for the right tools to do that super force for a whole range of Excel cells.
xlwings would be way too slow based on my experience. And i really need execution speed to go down to 2-3s.
(I acknowledge relying on Excel cell background colors can be disastrous! Heopfully what i meant with "background color" is more clear now)