{"id":484,"date":"2014-08-01T13:10:57","date_gmt":"2014-08-01T11:10:57","guid":{"rendered":"http:\/\/blogs.igalia.com\/jaragunde\/?p=484"},"modified":"2014-08-01T13:10:58","modified_gmt":"2014-08-01T11:10:58","slug":"tales-of-libreoffice-interoperability-the-missing-files","status":"publish","type":"post","link":"https:\/\/blogs.igalia.com\/jaragunde\/2014\/08\/tales-of-libreoffice-interoperability-the-missing-files\/","title":{"rendered":"Tales of LibreOffice interoperability: the missing files"},"content":{"rendered":"<p>I&#8217;m welcoming the <a href=\"http:\/\/blog.documentfoundation.org\/2014\/07\/30\/libreoffice-4-3-today-you-cant-own-a-better-office-suite\/\" title=\"LibreOffice 4.3.0 announcement\" target=\"_blank\">release of LibreOffice 4.3.0<\/a> in the name of <a href=\"http:\/\/www.igalia.com\/\" title=\"Igalia website\" target=\"_blank\">Igalia<\/a> with the last post in <a href=\"http:\/\/blogs.igalia.com\/jaragunde\/tag\/lo-4-3-interoperability\/\" title=\"Tales of LibreOffice interoperability\" target=\"_blank\">this series<\/a>. This time we will talk about the preservation of embeddings in OOXML text documents, and devote some lines to the support of Structured Document Tags.<\/p>\n<h4>Embedded content in documents<\/h4>\n<p>Our goal this time was the preservation of embedded content in OOXML text documents, as a first step towards full support, as we did with other features. Inserting new embeddings in .docx documents or editing existing ones will have to come later.<\/p>\n<p>An embedded document usually consists of two files; one of them is a preview picture to be shown in the parent document, and the other one is the actual embedded document. 
For the case of a spreadsheet embedded in a text document, the most common case, you will find these two files in the document:<\/p>\n<p>[code]<br \/>\n\/word\/media\/image1.emf<br \/>\n\/word\/embeddings\/Microsoft_Excel_Worksheet1.xlsx<br \/>\n[\/code]<\/p>\n<p>With the corresponding entries in the relationships file:<\/p>\n<p>[code language=&#8221;xml&#8221;]<br \/>\n&lt;Relationship Id=&quot;rId2&quot;<br \/>\n  Type=&quot;http:\/\/schemas.openxmlformats.org\/officeDocument\/2006\/relationships\/package&quot;<br \/>\n  Target=&quot;embeddings\/oleObject1.xlsx&quot; \/&gt;<br \/>\n&lt;Relationship Id=&quot;rId3&quot;<br \/>\n  Type=&quot;http:\/\/schemas.openxmlformats.org\/officeDocument\/2006\/relationships\/image&quot;<br \/>\n  Target=&quot;media\/image1.emf&quot; \/&gt;<br \/>\n[\/code]<\/p>\n<p>The relevant bits in the <em>document.xml<\/em> file are below. Notice that a <em>w:object<\/em> consists of one shape, which is filled with data from an image file, and the OLE object itself, linked to the embedded spreadsheet. 
Also notice the attribute <em>ProgID<\/em>, which defines the program, document type and version:<\/p>\n<p>[code language=&#8221;xml&#8221;]<br \/>\n&lt;w:object&gt;<br \/>\n  &lt;v:shape id=&quot;ole_rId2&quot;<br \/>\n  style=&quot;width:362.25pt;height:146.25pt&quot; o:ole=&quot;&quot;&gt;<br \/>\n    &lt;v:imagedata r:id=&quot;rId3&quot; o:title=&quot;&quot; \/&gt;<br \/>\n  &lt;\/v:shape&gt;<br \/>\n  &lt;o:OLEObject Type=&quot;Embed&quot; ProgID=&quot;Excel.Sheet.12&quot;<br \/>\n  ShapeID=&quot;ole_rId2&quot; DrawAspect=&quot;Content&quot;<br \/>\n  ObjectID=&quot;_570182397&quot; r:id=&quot;rId2&quot; \/&gt;<br \/>\n&lt;\/w:object&gt;<br \/>\n[\/code]<\/p>\n<p>There is one more element that allows Word to detect the embedded file properly, a content type definition in the content types file:<\/p>\n<p>[code language=&#8221;xml&#8221;]<br \/>\n&lt;Override PartName=&quot;\/word\/embeddings\/oleObject1.xlsx&quot;<br \/>\nContentType=&quot;application\/vnd.openxmlformats-officedocument.spreadsheetml.sheet&quot; \/&gt;<br \/>\n[\/code]<\/p>\n<p>As you can see, there are three elements that determine the kind of embedding we are dealing with, and Word requires the right combination of all three:<\/p>\n<ul>\n<li>The properties in the <em>o:OLEObject<\/em> tag in <em>document.xml<\/em><\/li>\n<li>The ContentType for the file defined in <em>[Content_Types].xml<\/em><\/li>\n<li>The Type of the Relationship defined in <em>document.xml.rels<\/em><\/li>\n<\/ul>\n<p>The most convenient way to achieve our goal was to use the <em>grab bag<\/em> technique to store the ProgID attribute of the object, and infer the correct content type and relation type from it. Some examples:<\/p>\n<ul>\n<li>An object with ProgID <strong>Excel.Sheet.12<\/strong> is an OOXML spreadsheet. 
Its media type must be <em>application\/vnd.openxmlformats-officedocument.spreadsheetml.sheet<\/em> and the relation type is <em>http:\/\/schemas.openxmlformats.org\/officeDocument\/2006\/relationships\/package<\/em>.<\/li>\n<li>If the ProgID is <strong>Excel.Sheet.8<\/strong>, this is an old Office spreadsheet. Now the media type must be <em>application\/vnd.ms-excel<\/em> and the relation type <em>http:\/\/schemas.openxmlformats.org\/officeDocument\/2006\/relationships\/oleObject<\/em>.<\/li>\n<\/ul>\n<p>If you detect a particular type of embedding in your documents that isn&#8217;t being preserved, drop us a line in <a href=\"https:\/\/bugs.freedesktop.org\/\" title=\"freedesktop.org bugzilla\" target=\"_blank\">Bugzilla<\/a>. A patch to add new relations of this kind should be quick and easy.<\/p>\n<h4>Bonus track: Structured Document Tags<\/h4>\n<p>Structured Document Tags (SDTs) are a family of document objects that includes form-like controls, citations, tables of contents and bibliography tables, among many others. This variety of uses means that they can live inside a paragraph or they can be a high-level element that contains several paragraphs and even shapes, which of course is tricky to implement.<\/p>\n<p>For 4.3.0 we have worked on some of these tags, and we can say we have properly implemented the import and export of combo, date and check boxes. We also wrote some code to preserve generic SDTs; most of the tags are now preserved, although some formatting issues remain. The proper way to support every kind of SDT is to translate them to the equivalent objects in LibreOffice on import and translate them back to SDTs on export, but that will require time and work. Any volunteers? 
\ud83d\ude09<\/p>\n<h4>Wrap-up<\/h4>\n<p>Despite the 6-month development cycles, I feel like the development of the 4.3 line started a long time ago and I may have forgotten to write about some little feature or fix&#8230; Anyway, it&#8217;s time to close this batch of <a href=\"http:\/\/blogs.igalia.com\/jaragunde\/tag\/lo-4-3-interoperability\/\" title=\"Tales of LibreOffice interoperability\" target=\"_blank\">blog posts about interoperability features<\/a>, all of them <a href=\"http:\/\/www.igalia.com\/\" title=\"Igalia website\" target=\"_blank\">developed by Igalia<\/a> and <a href=\"http:\/\/www.cloudon.com\/\" title=\"CloudOn website\" target=\"_blank\">sponsored by CloudOn<\/a>.<\/p>\n<p>Enjoy our shiny new LibreOffice, and happy hacking!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;m welcoming the release of LibreOffice 4.3.0 in the name of Igalia with the last post in this series. This time we will talk about the preservation of embeddings in OOXML text documents, and devote some lines to the support &hellip; <a href=\"https:\/\/blogs.igalia.com\/jaragunde\/2014\/08\/tales-of-libreoffice-interoperability-the-missing-files\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,17],"tags":[20],"class_list":["post-484","post","type-post","status-publish","format-standard","hentry","category-igalia","category-libreoffice","tag-lo-4-3-interoperability"],"_links":{"self":[{"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/posts\/484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/comments?post=484"}],"version-history":[{"count":9,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/posts\/484\/revisions"}],"predecessor-version":[{"id":495,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/posts\/484\/revisions\/495"}],"wp:attachment":[{"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/media?parent=484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/categories?post=484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.igalia.com\/jaragunde\/wp-json\/wp\/v2\/tags?post=484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}