Find and Replace with StructuredDocumentTag

September 12, 2016, 4:26 am

Hi,

I want to find text in an existing Word document and replace it with a StructuredDocumentTag.

All the "find and replace" examples I found are based on Runs, and I can't insert a StructuredDocumentTag into a Run node.

What should I do in this case?

Thanks

↧

Working with StructuredDocumentTags

July 16, 2015, 8:43 am

Hi,

We are now evaluating Aspose.Words for use in our product.

During the evaluation we are building a demo application to demonstrate the potential of Aspose.Words internally.

The demo application should work with Word templates (read and write) and we have the following questions:

1) How to create a StructuredDocumentTag that allows the user to enter only numbers (and how to control if the user can use the dot (.) sign)?

2) How to set the “watermark” (placeholder) of a StructuredDocumentTag?

3) How to create a repeating section content control?

Thanks,

Omri

↧

Unable to read tables data from StructuredDocumentTag directly

May 23, 2016, 10:03 am

Hi,

Attached is a document with some StructuredDocumentTags. The StructuredDocumentTags contain text and table data.

If I convert the nodes to HTML in a generic method, I get all the data from the document. But if I convert only the StructuredDocumentTags, I don't get the table data.

All the tables and the text are inside the StructuredDocumentTags, so logically the result should be the same.

I think this is a bug in your parsing: you treat the table as being outside the StructuredDocumentTag, but Word shows that it is inside.

Here is example code that takes this Word document and creates two .htm files: one is good (using the generic approach) and one is bad (recursively going over the StructuredDocumentTags and getting the data out of them).

var html = string.Empty;

var htmlSaveOptions = new Aspose.Words.Saving.HtmlSaveOptions
{
    ExportImagesAsBase64 = true,
    ExportHeadersFootersMode = Aspose.Words.Saving.ExportHeadersFootersMode.None
};

// Generic approach - go over all the nodes
using (var inputStream = File.OpenRead(@"E:\WordTest\11.docx"))
{
    var doc = new Aspose.Words.Document(inputStream);
    CompositeNode parent = doc;

    foreach (Aspose.Words.Node node in doc.ChildNodes)
    {
        html += node.ToString(htmlSaveOptions);
    }

    File.WriteAllText(@"E:\WordTest\11_good.htm", html, Encoding.UTF8);
}

// Exclusive approach - go over the StructuredDocumentTags only
html = string.Empty;

using (var inputStream = File.OpenRead(@"E:\WordTest\11.docx"))
{
    var doc = new Aspose.Words.Document(inputStream);

    html = ReadStructuredDocumentTagOnly(html, htmlSaveOptions, doc);

    File.WriteAllText(@"E:\WordTest\11_bad.htm", html, Encoding.UTF8);
}

Helper method:

private static string ReadStructuredDocumentTagOnly(string html, Aspose.Words.Saving.HtmlSaveOptions htmlSaveOptions, CompositeNode parent)
{
    foreach (Aspose.Words.Node node in parent.ChildNodes)
    {
        if (node.NodeType == NodeType.StructuredDocumentTag)
        {
            // Export every direct child of the content control as HTML.
            StructuredDocumentTag structuredDocumentTag = (StructuredDocumentTag)node;
            foreach (Aspose.Words.Node textNode in structuredDocumentTag.ChildNodes)
            {
                html += textNode.ToString(htmlSaveOptions);
            }
        }
        else if (node is CompositeNode)
        {
            // Not a content control: descend and keep looking for nested ones.
            if (((CompositeNode)node).ChildNodes != null
                && ((CompositeNode)node).ChildNodes.Count > 0)
            {
                html = ReadStructuredDocumentTagOnly(html, htmlSaveOptions, (CompositeNode)node);
            }
        }
    }

    return html;
}

Please fix this bug or advise us how to work around it.

Thanks

↧

Text scrambled when converting Word to HTML

June 28, 2016, 12:30 am

Hi,

For some reason, when we convert the following document to HTML, the text in the rows is scrambled. It seems that the last word becomes the first.

Here is our code:

var sourcefile = @"E:\WordTest\15.docx";
var html = string.Empty;

var htmlSaveOptions = new Aspose.Words.Saving.HtmlSaveOptions
{
    ExportImagesAsBase64 = true,
    ExportHeadersFootersMode = Aspose.Words.Saving.ExportHeadersFootersMode.None
};

using (var inputStream = File.OpenRead(sourcefile))
{
    var doc = new Aspose.Words.Document(inputStream);
    CompositeNode parent = doc;

    foreach (Aspose.Words.Node node in doc.ChildNodes)
    {
        html += node.ToString(htmlSaveOptions);
    }

    File.WriteAllText(@"E:\WordTest\15.html", html, Encoding.UTF8);
}

Please advise how to work around this bug, or release a fix.

Thanks!

↧

TXT file rendered not as expected.

November 7, 2016, 4:38 am

Hi,

Attached is a simple TXT file. When the file is rendered, some unexpected dots are added.

For example, "1 test" is rendered as "1. test" (see the attached screenshots).

Is this a bug or a feature? Is there a setting to avoid such text manipulation?

Best Regards,

Vassil

↧

Aspose.Words .Net4.5.2 Support

November 7, 2016, 9:09 pm

Hi,

I am using Aspose.Words version 9.7.0.0. I am looking to upgrade my application to .NET 4.5.2. Will the DLL be supported?

↧

importing html extremely slow in linux

November 6, 2016, 6:54 pm

Hi,

We are evaluating the Aspose solution to convert an HTML file to DOC format. Loading the attached HTML file into the document model takes less than 90 seconds in a Windows environment. When we deploy it to a Linux box, the same operation takes more than 400 seconds.

At first we thought it might be a TrueType font issue on Linux, so we uploaded all the Windows TrueType files to a folder and used code like FontSettings.getDefaultInstance().setFontsFolder(fontFolder,true); Unfortunately, it doesn't seem to work. On the other hand, loading the generated DOC file (converted from the attached HTML file) is quite quick.

Code snippet is as below:

FontSettings.getDefaultInstance().setFontsFolder(fontFolder, true);
File file = new File("456456test.html");
FileInputStream fis = new FileInputStream(file);
Document doc = new Document(fis);
// fos is the output stream for the DOC file; it is opened elsewhere (not shown here)
doc.save(fos, SaveFormat.DOC);
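
To narrow down which step is slow on Linux, the load and save stages can be timed separately. The following is only a minimal sketch using the standard library: `StageTimer` and `timeMillis` are hypothetical names, not part of Aspose.Words, and the workload in `main` is a stand-in for the real `new Document(fis)` and `doc.save(...)` calls.

```java
import java.util.concurrent.Callable;

final class StageTimer {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Callable<?> task) {
        long start = System.nanoTime();
        try {
            task.call();
        } catch (Exception e) {
            // Wrap checked exceptions so callers can time any stage with a lambda.
            throw new IllegalStateException("timed stage failed", e);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption). In the real test this lambda would wrap
        // the document load, and a second call would wrap the save.
        long loadMs = timeMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
            return sb.length();
        });
        System.out.println("load stage: " + loadMs + " ms");
    }
}
```

Running the same harness on both machines would show whether the extra time is spent in loading the HTML or in saving the DOC file.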

Thanks.

↧

OutOfMemory with updateFields and other operations

August 9, 2011, 4:57 am

Hey all,

The issues reported here were discovered while investigating the workaround suggested in the topic "low performance when save document to PDF format through Aspose Word Java library".

Using updateFields, getPageCount or saving to PDF/XPS (regardless of whether only the 1st page or all pages are saved) for a document larger than 3000 pages causes an OOM exception on a 32-bit JRE with 1 GB of heap available. The test code is attached if needed.

From the discussions on the other thread I can assume the problem is caused by Aspose creating the APS (Aspose Page Specification) model in memory. And while I understand why this is happening and the technical challenges, this is a serious limitation of the updateFields functionality in addition to the PDF save.
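
When reproducing this, it can help to log how much heap the JVM really has before calling the heavy operations. Below is a minimal sketch that uses only the standard `Runtime` API; the class and method names are illustrative assumptions, and the Aspose calls are mentioned only in a comment.

```java
final class HeapReport {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use (the -Xmx limit), in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied by objects (allocated minus free), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        // Print these values right before updateFields, getPageCount or the PDF save
        // to confirm how much headroom is left at that point.
        System.out.println("max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("used heap: " + usedHeapMiB() + " MiB");
    }
}
```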

Regards,
Dragos

↧

System.ArgumentException while saving odt document

June 24, 2016, 4:02 am

Hi,

I'm getting an error while saving some pages of the attached file (e.g. the 6th and 8th) as PNG.

Aspose.Words.Saving.ImageSaveOptions imageSaveOptions = new Aspose.Words.Saving.ImageSaveOptions(Aspose.Words.SaveFormat.Png);
imageSaveOptions.PageIndex = pageNumber - 1;
imageSaveOptions.PageCount = 1;
imageSaveOptions.Resolution = 96;
doc.Save(@"c:\test.png", imageSaveOptions);

↧

Replace a word using an image

November 6, 2016, 7:53 pm

Hi,

Is it possible to replace text in a Word document with an image? I was able to replace an image with an image, but I want to replace a word with an image. Is there a way to do it?

Thank You.

↧

Adding Watermark to Document

November 2, 2016, 12:16 pm

I am using Aspose.Words to convert an image to PDF.

In addition, I want to add a watermark that appears on all pages.

To add the watermark I added a Shape to the Document.

The shape is added properly, but the image covers part of the watermark.

Can you help me find out how to place the image behind the watermark?

This is the code I wrote for adding the stamp:

Aspose.Words.Drawing.Shape watermark = new Aspose.Words.Drawing.Shape(doc, Aspose.Words.Drawing.ShapeType.TextPlainText);
watermark.TextPath.Text = watermarkText;

// Create a new paragraph and append the watermark to this paragraph.
Paragraph watermarkPara = new Paragraph(doc);
watermarkPara.AppendChild(watermark);

This is the code I used to add the image:

builder.InsertImage(image, RelativeHorizontalPosition.Page, leftPos, RelativeVerticalPosition.Page, tmpTopMargin + 20, fixedSize.Width, fixedSize.Height, WrapType.Through);

sample attached

Thx

Yaniv

↧

Corrupted RTF with footnote

November 8, 2016, 5:12 am

Hi,

Aspose Words for Java (tested v16.11.0) cannot read the following RTF file.
"com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded."

We don't know how this file was created, but it seems that there is an invalid footnote on a table cell (and not a paragraph). MsWord 2010 seems to keep this footnote while saving to RTF. However it is lost when saved to DOCX.

I suggest that such invalid footnotes are ignored, like MsWord does when saving to DOCX.

Thanks

Romain

↧

Re: The document appears to be corrupted and cannot be loaded

November 8, 2016, 7:05 am

I'm trying to convert an 'odt' file to 'pdf' and 'pdf-a', and this conversion sometimes fails. I get the error message "The document appears to be corrupted and cannot be loaded", but the file seems to be OK. I'm using Aspose.Words version 15.4.0.0, and I have attached the file that doesn't work.

Thanks

Davide

↧

Replacing Error

September 20, 2016, 12:47 am

The error screenshot, the doc file and the source code (global.zip) are in the attached files.

Thanks

↧

Images in table cells

November 8, 2016, 3:03 am

Can we insert multiple images in a Word table cell?

I have a list of objects, each of which contains a list of images. Can I insert them into a cell in each row?

↧

Charts in Word

November 8, 2016, 2:58 am

Does Aspose provide charts such as pie, meter, bar and line charts in a Word document?

If yes, from which version?

If no, what is the alternative in Aspose? Please suggest.

↧

border-collapse and border-spacing seems to be ignored

November 9, 2016, 1:44 am

Hi,

We have an HTML document with CSS. The table is formatted with border-collapse and border-spacing.

You can see that the Word document has the required formatting. If we save the document as PDF, there is no border-spacing at all.

We are using Aspose.Words 16.11.0

2x Windows 10 64-bit, 1x Windows 7 64-bit

Word 2013

Here is a test project:

import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;

/**
 * Created by Alexander.Joerg on 09.11.2016.
 */

public class AsposeTest {
    public static void main(String[] args) {

        try {

            String html = "<style>\n" +
"    #article {\n" +
"        width: 100%;\n" +
"        border-collapse: separate;\n" +
"        border-spacing: 5px\n" +
"    }\n" +
"\n" +
"    #article td, #article th {\n" +
"        font-size: 1em;\n" +
"        border: 1px solid #98bf21;\n" +
"        padding: 3px 7px 2px 7px;\n" +
"    }\n" +
"\n" +
"    #article th {\n" +
"        font-size: 1.1em;\n" +
"        text-align: left;\n" +
"        padding-top: 5px;\n" +
"        padding-bottom: 4px;\n" +
"        background-color: #a7c942;\n" +
"        color: #fff;\n" +
"    }\n" +
"\n" +
"    #article tr.alt td {\n" +
"        color: #000;\n" +
"        background-color: #eaf2d3;\n" +
"    }\n" +
"</style>\n" +
"<table id=\"article\">\n" +
"    <tr>\n" +
"        <th>Position</th>\n" +
"        <th>Article</th>\n" +
"        <th>Desc</th>\n" +
"        <th>Tax</th>\n" +
"        <th>Amount</th>\n" +
"        <th>Unitcost\n</th>\n" +
"        <th>TotalPrice</th>\n" +
"    </tr>\n" +
"    <tr class=\"alt\">\n" +
"        <td>1</td>\n" +
"        <td>0000001</td>\n" +
"        <td>Table</td>\n" +
"        <td>19,00</td>\n" +
"        <td>1 ST</td>\n" +
"        <td>250 EUR</td>\n" +
"        <td>250 EUR</td>\n" +
"    </tr>\n" +
"    <tr>\n" +
"        <td>2</td>\n" +
"        <td>0000002</td>\n" +
"        <td>Bench</td>\n" +
"        <td>19,00</td>\n" +
"        <td>2 ST</td>\n" +
"        <td>100 EUR</td>\n" +
"        <td>200 EUR</td>\n" +
"    </tr>\n" +
"    <tr class=\"alt\">\n" +
"        <td></td>\n" +
"        <td></td>\n" +
"        <td></td>\n" +
"        <td></td>\n" +
"        <td></td>\n" +
"        <td><b>TotalPrice</b></td>\n" +
"        <td><b>450 EUR</b></td>\n" +
"    </tr>\n" +
"</table>\n";


            Document document = new Document();
            DocumentBuilder documentBuilder = new DocumentBuilder(document);

            documentBuilder.insertHtml(html);

            document.save("C:\\temp\\testDocument.docx");
            document.save("C:\\temp\\testDocument.pdf");


        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

↧

Copying Sections - How to reduce space between sections?

November 8, 2016, 3:21 am

Hi,

I have a Word document (testContents.doc) that essentially has 3 sections. The 2nd section contains a table that holds some merge fields. In my code I copy section 2 'x' times so I can build a contents page. Overall this works fine: if I want 2 items in my contents page (as in the example), I copy section 2 and append it at the bottom. This works perfectly, but annoyingly I get a massive gap between the sections. Is there any way to reduce or remove it? Please look at the attached picture to see what I mean, as this is the output I get.

Simplified code is below. This is not my exact code, which is very complicated, but it is basically what I do.

I copy the section:

SectionCollection sections = contentPage.Sections;

Section copy = sections[1].Clone();

I alter the section "copy" by updating its merge fields. I then add the section to the document using the code below:

contentPage.Sections.Add(copy);

After all sections are added, I run the code below to combine all sections into one:

Section[] cSections = contentPage.Sections.ToArray();

// Append content of all sections to the first section
for (int i = 0; i < cSections.Length; i++)
{
    Section sect = cSections[i];
    if (!sect.Equals(contentPage.FirstSection))
    {
        contentPage.FirstSection.AppendContent(sect);
    }
}

// Remove all sections but the first
while (!contentPage.LastSection.Equals(contentPage.FirstSection))
{
    contentPage.LastSection.Remove();
}

↧

Copying Sections

November 2, 2016, 3:19 am

Hi,

I am making some PDFs for some financial clients and have had no problems, as Aspose is great, but I have come across a stumbling block and am not sure how to get this working. The gist of the problem is that I want to copy sections of a Word file but am not sure of the best way of doing it. Explanation and files are below.

I have a Word file which has a heading called Account Type. Beneath this heading is another heading called Account Name, and below that is a table that lists all transactions linked to that account. The idea is that my C# code should copy parts of this Word file if there are multiple account types and multiple accounts under a type. Please see the file called desiredOutput to see what I would expect to see.

The issue I am having is: how can I copy the parts of the Word file I require? For example, in the file called desiredOutputTemplate I am trying to use tags, so that if there is more than one account type, everything between the [REPEAT-SECTION-ACTYPE] tags is copied, and if there is more than one account under an account type, everything between the [REPEAT-SECTION-ACNAME] tags is copied. The idea is that I would have a section called UK Accounts and under it list all UK accounts, and if there was another type called US Accounts I would then show the title US Accounts and list all accounts beneath it. This is totally dynamic, so there could be any number of account types and accounts under each type.

I have done some work on this, so I can copy tables etc., but I am not sure how to copy content between 2 tags.
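
To make the intended behaviour concrete, here is a minimal sketch of the tag idea on plain strings. It is only an illustration under stated assumptions: it does not use Aspose.Words, it works on text rather than document nodes, and `MarkerRepeat`, `between`, `expand` and the `{type}` placeholder are hypothetical names that are not in the template.

```java
import java.util.Arrays;
import java.util.List;

final class MarkerRepeat {
    /** Returns the text between the first two occurrences of marker, or null if the pair is missing. */
    static String between(String text, String marker) {
        int open = text.indexOf(marker);
        if (open < 0) {
            return null;
        }
        int bodyStart = open + marker.length();
        int close = text.indexOf(marker, bodyStart);
        return close < 0 ? null : text.substring(bodyStart, close);
    }

    /** Replaces the marked block (markers included) with one copy of its body per value. */
    static String expand(String text, String marker, String placeholder, List<String> values) {
        String body = between(text, marker);
        if (body == null) {
            return text; // no complete marker pair: leave the text unchanged
        }
        StringBuilder repeated = new StringBuilder();
        for (String value : values) {
            repeated.append(body.replace(placeholder, value));
        }
        int open = text.indexOf(marker);
        int end = text.indexOf(marker, open + marker.length()) + marker.length();
        return text.substring(0, open) + repeated + text.substring(end);
    }

    public static void main(String[] args) {
        String tag = "[REPEAT-SECTION-ACTYPE]";
        String template = "Statement\n" + tag + "Type: {type}\n" + tag + "End";
        System.out.println(expand(template, tag, "{type}", Arrays.asList("UK Accounts", "US Accounts")));
    }
}
```

In a real Word document the same steps (find the opening tag, find the closing tag, clone what lies between them once per item) would have to be done on document nodes rather than on a string.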

Maybe you guys can think of a better way to get round this problem without using tags?

If you need any more info please let me know, as it is pretty complicated.

↧

Symbols are wrong in the generated PDF rendition (CTS-4714)

November 9, 2016, 2:42 am

Hi,

While converting the attached document to PDF, we see that a lot of small symbols change during the conversion. For example, a hyphen becomes ###, and so on. I have highlighted a few occurrences in yellow for demonstration (though there are many more in the file). Please check.

The issue is critical, so we would appreciate it if you could keep the bug severity high.

Thanks,

Rajiv

↧