Thank you for Python4Capella!
It is pretty easy with Python4Capella to export an Excel file with a list of elements and their descriptions. However, what we retrieve in Excel is the raw XHTML markup with all its tags, not the description as we see it in Capella description widgets (with formatting applied).
What I need is something like that:
This need seems pretty basic. You could tell me this is already doable with a table in M2Doc. But since M2Doc has limitations regarding complex tables (merged cells), it is not a true alternative.
(More information about my operational need is given in this M2Doc question on Stack Overflow):
Workarounds to handle merged table cells in M2Doc - Stack Overflow
The need seems simple, but my first impression is that it will not be easy, simply because MS Excel does not support rendering HTML content within cells.
One solution may be to write some functions that transform the HTML into Excel cell formatting using the XlsxWriter library (see "Example: Writing “Rich” strings with multiple formats" in the XlsxWriter documentation). In any case, this will be limited by what Excel offers for formatting text within a cell: only text, with changes to its size, color, font and style. (You can see this if you enter edit mode inside an Excel cell.)
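To make the idea concrete, here is a minimal sketch of such a transformation, not a finished converter. It assumes only a handful of inline tags (`b`, `strong`, `i`, `em`, `u`) and ignores everything else. The names `TAG_FORMATS`, `RunCollector`, `html_to_runs` and `write_description` are made up for this illustration; the only XlsxWriter calls used are `Workbook.add_format()`, `Worksheet.write_string()` and `Worksheet.write_rich_string()`.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a few inline HTML tags to XlsxWriter format properties.
TAG_FORMATS = {
    "b": {"bold": True},
    "strong": {"bold": True},
    "i": {"italic": True},
    "em": {"italic": True},
    "u": {"underline": 1},
}


class RunCollector(HTMLParser):
    """Collect (format_properties, text) runs from simple inline HTML."""

    def __init__(self):
        super().__init__()
        self.stack = []  # formatting tags that are currently open
        self.runs = []   # list of (dict, str) pairs

    def handle_starttag(self, tag, attrs):
        if tag in TAG_FORMATS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Close the most recently opened occurrence of this tag.
            idx = len(self.stack) - 1 - self.stack[::-1].index(tag)
            del self.stack[idx]

    def handle_data(self, data):
        props = {}
        for tag in self.stack:
            props.update(TAG_FORMATS[tag])
        if self.runs and self.runs[-1][0] == props:
            # Merge with the previous run when the formatting is identical.
            self.runs[-1] = (props, self.runs[-1][1] + data)
        else:
            self.runs.append((props, data))


def html_to_runs(html):
    """Return the list of (format_properties, text) runs found in html."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()
    return parser.runs


def write_description(workbook, worksheet, row, col, html):
    """Write html into one cell of an XlsxWriter worksheet."""
    runs = html_to_runs(html)
    if not runs:
        return
    if len(runs) == 1:
        # write_rich_string() needs several fragments, so use write_string().
        props, text = runs[0]
        cell_format = workbook.add_format(props) if props else None
        worksheet.write_string(row, col, text, cell_format)
        return
    fragments = []
    for props, text in runs:
        if props:
            fragments.append(workbook.add_format(props))
        fragments.append(text)
    worksheet.write_rich_string(row, col, *fragments)
```

With a workbook created by `xlsxwriter.Workbook(...)` and a worksheet from `add_worksheet()`, calling `write_description(workbook, worksheet, 0, 0, "Hello <b>bold</b> text")` should produce a cell where only "bold" is bold. Colors, sizes, lists and tables would need more tag handling than this sketch provides.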
Thank you Stephane.
You are right, I noticed as well that Excel is limited in the internal formatting of cells, and that alone has killed my initial intent to rely on Python to generate complex tables for later insertion in M2Doc-generated documents. The project I am coaching has agreed to a different formatting of tables that can be implemented with M2Doc.
The library you are pointing out seems to provide everything that is needed to write an XHTML parser for Excel. I will not develop it now because my original need has been dealt with. But I would say that the first of us who truly has this need and develops this (generic) parser should definitely contribute it to Python4Capella!
Yes, and I am looking into it in some of my free time…
I answered on Stack Overflow.
Please find attached the current status of my work, hopefully this will be contributed to Python for Capella:
HTML_RichText_Converter.py (16.7 KB)
HTMLDescription_To_Excel_Description.py (2.1 KB)
This script uses the standard xlsxwriter library (rather than the openpyxl library we use in other example scripts) simply because this kind of in-cell formatting is impossible to do with openpyxl. There is no additional component to install, as xlsxwriter is already in the package we deliver with Python4Capella.
This script works pretty well. There are significant limitations, for the simple reason that Excel itself is very limited in what it can do within a cell (put simply, Excel cannot render HTML within a cell, so there are no images, no “true” links, no tables…). Still, the result is generally pretty convincing when the formatting in the Description is not too complex.
This script handles most of what you can do with Capella’s current HTML Description editor (font, color, sizes, etc.). I have two final improvements to make: (1) better handling of table/td/tr tags, and (2) better handling of nested ul/ol/li tags. But I think what is there already provides a pretty nice result.
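For readers wondering what handling nested ul/ol/li tags could look like, here is one possible approach, sketched independently of the attached scripts: since an Excel cell can only hold text, each list item becomes one text line, indented by its nesting depth and prefixed with a bullet or a number. `ListFlattener` and `flatten_lists` are hypothetical names, and the sketch drops inline formatting entirely.

```python
import re
from html.parser import HTMLParser


class ListFlattener(HTMLParser):
    """Flatten nested ul/ol/li markup into indented plain-text lines."""

    def __init__(self, indent="    "):
        super().__init__()
        self.indent = indent
        self.lists = []      # one [kind, item_counter] entry per open list
        self.lines = []      # finished output lines
        self.current = None  # line being built, or None

    def _flush(self):
        # Store the line in progress, skipping lines that are only whitespace.
        if self.current is not None and self.current.strip():
            self.lines.append(self.current.rstrip())
        self.current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush()
            self.lists.append([tag, 0])
        elif tag == "li" and self.lists:
            self._flush()
            entry = self.lists[-1]
            entry[1] += 1
            marker = f"{entry[1]}." if entry[0] == "ol" else "\u2022"
            depth = len(self.lists) - 1
            self.current = f"{self.indent * depth}{marker} "

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self.lists:
            self._flush()
            self.lists.pop()
        elif tag == "li":
            self._flush()

    def handle_data(self, data):
        # Collapse whitespace the way a browser would for normal text.
        text = re.sub(r"\s+", " ", data)
        if not self.current or self.current.endswith(" "):
            text = text.lstrip()
        if not text:
            return
        self.current = (self.current or "") + text


def flatten_lists(html):
    """Return html as plain text with one line per list item."""
    parser = ListFlattener()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n".join(parser.lines)
```

The resulting string contains line breaks, so the target cell would need a format with `text_wrap` enabled in XlsxWriter for the lines to display separately.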
This is a great contribution Stéphane, thank you very much.
While the limitations of Excel are showstoppers for my publication use cases, I definitely see how having this level of formatting rather than markup language will help other collaboration use cases.
I committed the two scripts; see this commit.
Thank you for your contribution.