Daniel Campos: IT Everywhere - OpenXML Enthusiast
May 05 Office Contest Winner 2009
Hey friends,
I'm very proud to say that I won first place in the Microsoft Office Contest here in Brazil with the Images 2 OpenXML solution. Competitors had to create an application that generates OpenXML files and uses Office as its base (APIs, front ends and so on).
The Images 2 OpenXML project shows how to integrate the Office 2007 OCR engine with custom applications and use OpenXML and Speech Recognition as application features. The main purpose of the tool is to convert scanned documents to the new Office 2007 format (OpenXML), while speech recognition improves the user experience. The application uses the Office 2007 OCR API, called MODI (Microsoft Office Document Imaging), which really helps with simple document conversions.
This project is based on the article that I wrote some months ago. The tool has been upgraded since then and more features will come. Hope you like the application.
You can check the contest results and the descriptions of the winning projects at http://www.microsoft.com/brasil/office/resultadoconcurso/home.aspx (in Portuguese).
See ya.
January 29 A little bit busy
Hey friends,
Here I am again. First, I want to wish you all a (belated) happy new year. Things here at MIC are a little busy: I'm working on lots of projects that will be released here soon, and there are a lot of new things coming up in the next few days (months =P). For now, here is the latest news about Office development, OpenXML, and some other cool stuff that I found on the web.
If you're looking for some good reading about OpenXML, here are some tips that might help:
Zeyad Rajabi keeps posting interesting articles about OpenXML; the latest one is Traversing in the OpenXML DOM. It basically continues what Ali started, and both cover some OpenXML basics.
Doug Mahugh brings us the news about the stabilization of the IS 29500 standard (a.k.a. the OpenXML format ;-)). There's also a very interesting article about the implementation notes for Office 2007 SP2; you can check it out here.
Mary Lee from the VSTO team explains how to deploy Office solutions using Windows Installer; you can see the article here. (I know it's not new, but it's very useful.)
You can check out an interesting video about how to customize an Office Ribbon bar within custom applications using MFC and a free tool called Axialis. Here's the video.
There's a cool post on Code Project explaining how to create an iPhone UI for Windows Mobile. In this article you'll be introduced to concepts like AlphaBlend, image loading, transparency and so on. Here's the link: http://www.codeproject.com/KB/mobile/IPhoneUI.aspx. If you want to see this code with some additions (to work with PocketPC 2003 and so on), send me an e-mail.
In December I spoke at MAD (Microsoft Academy Day), presenting the OpenXML SDK 2.0 and development with VSTO 3.0. There are some pictures here: (if you want the demos and the presentation, send me an e-mail too)
Finally, here's an awesome picture from my vacation ;-) By the way, this is where I live.
See ya soon guys
October 06 Microsoft Student to Business Program
Hey guys,
Last Friday, 9/03/2008, we started the 4th edition of the Microsoft Student to Business program here in Brazil. We had 36K+ people registered throughout the country.
The Students to Business (S2B) program is a Microsoft® Community Initiative designed to connect Microsoft partners and customers with qualified students for entry-level and internship positions. The program is composed of three stages. The first is a big class where we explain the local market, job positions and so on (related to the IT area). The second consists of 36+ hours of training in two different tracks: System Development (using .NET) and Network Administration (Windows Server technologies). The third consists of developing an application using the technologies learned during the second stage (.NET, VSTS, WinForms, ASP.NET), or solving a specified problem with Microsoft network administration technologies.
In total, the attendees will have 80+ hours of training with the most specialized professionals (the guys from the Microsoft Innovation Centers, MVPs and so on). Congrats to everyone who passed to the second stage!!!
See ya!!!
September 19 Converting images to text using Office 2007 OCR, OpenXML and Speech Recognition
Hey folks,
Last week I posted an article on the Code Project portal and I'll reproduce it here:
Introduction
Sometimes while developing an application we face situations where we have a scanned document (an image) and want to convert it to text (a Word 2007 document). Some scanners ship with applications that perform this kind of conversion automatically, but most of the time the generated document is a .pdf, .odt or similar. If you want to convert directly to .docx (OpenXML) documents, you'll have to use third-party applications or develop the conversion from scratch.
OpenXML became an ISO standard (IS29500) and its adoption is growing day after day, driven by its performance, scalability and security. It is the default format of Microsoft Office 2007 documents (.pptx, .docx, .xlsx). Files can be up to 75 percent smaller than the comparable binary documents, and the format is based on two major technologies: ZIP and XML.
Scenario
To make developers' work easier and avoid integration with third-party applications, Microsoft released an OCR (Optical Character Recognition) API with Office 2007, called MODI (Microsoft Office Document Imaging). It's important to remember that the API used in this sample is exclusive to Office 2007 (Office 2003 has its own OCR API).
In this article we'll create a Windows application that uses the Office 2007 OCR API to generate OpenXML documents. In addition, we'll use the Speech Recognition API to improve the application's user experience. Before we start, you need to have the following requirements installed:
You need to have the Microsoft Office Document Imaging 12.0 Type Library installed. The Office 2007 setup doesn't install this component by default, so you have to add it afterwards. To do this:
Using the MODI
To use the Office 2007 OCR API, you have to add a reference to the Microsoft Office Document Imaging 12.0 Type Library. To do this:
Create a MODI object:
In the Form class constructor, instantiate the MODI object:
After that you just have to implement the conversion method. Let's see how to do this:
The OCRImplementation method converts image files (.tif, .jpg, .gif, .bmp; in this case we're using a TIFF file). The Create method of the md object receives the path of the file to be converted. The OCR method receives three parameters: the first represents the language of the document, the second specifies whether the OCR engine attempts to determine the orientation of the page, and the third specifies whether the OCR engine attempts to fix small angles of misalignment from the vertical.
To retrieve the text, you need references to the Image and Layout objects. The Layout object gives access to the recognized text: its Words property has a Count property that lets you iterate through the list of words. Each word is retrieved through the indexer, and we add a blank space between words as we append them. The Close method of the md object takes a boolean argument indicating whether to save changes to the image file.
Using OpenXML SDK
In the Solution Explorer, add a reference to the DocumentFormat.OpenXML library. This library is what turns the converted text into a Word document. A string constant holds the structure and relationships of the document (it defines the markup, in this case WordprocessingML). The CreateDocument method is responsible for inserting the text into that document structure.
Speech Recognition
Add a reference to System.Speech from the .NET tab. After that you just have to adjust the Volume and Rate properties and call the Speak method to speak a string. (These are members of the SpeechSynthesizer class, so strictly speaking this part is speech synthesis: the application reads text aloud.)
Conclusion
Combining these powerful APIs is an interesting idea, and the OCR code is very short compared with third-party APIs. It is a tool that can be explored in many ways, and integrating it with OpenXML and speech features can improve your applications. You can download the code here
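As a footnote to the "ZIP and XML" point and the OpenXML SDK step above, here is a minimal sketch of what ends up inside a .docx package. It is written in Python with only the standard library rather than the C# of the original project, and it is not the code from the download. The function name build_docx is an assumption made up for this illustration; the part names and content-type strings are the standard ones for a WordprocessingML main document.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Declares the content type of every part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package-level relationship that points at the main document part.
PACKAGE_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_docx(text: str) -> bytes:
    """Return the bytes of a one-paragraph .docx (hypothetical helper)."""
    # escape() protects &, < and > so the text cannot break the markup.
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f'<w:t xml:space="preserve">{escape(text)}</w:t>'
        '</w:r></w:p></w:body></w:document>'
    )
    buffer = io.BytesIO()
    # The package itself is an ordinary ZIP archive with three XML parts.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()
```

Writing the returned bytes to a file with a .docx extension should give a document that Word can open. The SDK's WordprocessingDocument class takes care of the content types and relationships for you, which is the main reason to use it instead of building the ZIP by hand.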
See ya.
August 28 Creating OpenXML Documents with SDK 1.0
Hey folks,
Recently some guys asked me how to create Word 2007 documents (.docx) without using Microsoft Office Word 2007, and that's the reason for this post. It's basically an idea of how to create Word documents from a custom application; you can improve the application by adding new features such as bold, italics, colors, size settings and so on. Well, let's begin our work:
First you need the OpenXML SDK 1.0 installed. If you don't have it, you can download it here. Let's assume that you already have the SDK installed, so let's begin.
In Visual Studio, create a Windows Forms application and name it CustomTextEditor.
In the Solution Explorer, add a reference to the DocumentFormat.OpenXML assembly (it's located on the .NET tab).
Add a TextBox control to the form. Set the Multiline property to True and the ScrollBars property to Vertical.
Add a label and change its Text property to Save Path. To the right of the label add a text box, and to the right of this text box add a button. Set the button's Name property to btn_Path and its Text property to ' ... '
Add three more buttons to the form:
Add a saveFileDialog to the form. You should have something like this:
Double-click each button to generate its event handler. In the code window add the following code:
This constant is responsible for the definition of the document body (relationships and markup).
Now we'll create a method that creates the document. This method receives a string as a parameter, which is basically the text written in TextBox1.
The first thing we have to do is create an object to define the package. For this task we have the WordprocessingDocument class, which defines the package that represents a Word document. As you can see, this class has a Create method that receives two parameters. The first parameter is the path where your document will be saved; the second is an enum that defines the type of the document (Document, MacroEnabledDocument, MacroEnabledTemplate or Template).
After this we'll define the main document part:
The MainDocumentPart property gets the main part of the WordprocessingDocument. The AddMainDocumentPart method creates the main document part and adds it to the document.
After this we'll create a string that replaces the #OLD_TEXT# tag. It inserts the text from the text box into the structure (so the text ends up inside a WordprocessingML structure). To replace the text we'll use the Replace method of the String class.
After this we have to get the Stream of the document part, encode the string that contains the structured text, write it to the stream and save it.
The Close method is responsible for saving and closing the OpenXML package.
Let's implement the button events:
The first event that we have to implement is the btn_Path event, which is responsible for defining the name and the path of the file.
Implement the btn_Save event. This event calls the CreateDocument method.
Implement the other two events as in the following code:
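The placeholder step described above can be sketched as follows. This is a language-neutral illustration in Python, not the C# listing from the post: DOCUMENT_TEMPLATE, render_main_part and extract_text are hypothetical names, and only the #OLD_TEXT# tag comes from the article. One detail worth adding to the plain Replace call is XML escaping; without it, a user who types & or < in the text box would produce a document that Word refuses to open.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Template for the main document part; #OLD_TEXT# marks where the
# text typed by the user will be inserted.
DOCUMENT_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    '<w:t xml:space="preserve">#OLD_TEXT#</w:t>'
    '</w:r></w:p></w:body></w:document>'
)


def render_main_part(text: str) -> bytes:
    """Fill the placeholder and encode the result for the part stream."""
    # Escape first so characters such as & and < stay valid XML content.
    xml = DOCUMENT_TEMPLATE.replace("#OLD_TEXT#", escape(text))
    # The XML declaration says UTF-8, so the bytes must be UTF-8 too.
    return xml.encode("utf-8")


def extract_text(part: bytes) -> str:
    """Parse a rendered part and return the text of its w:t elements."""
    root = ET.fromstring(part)
    return "".join(node.text or "" for node in root.iter(f"{{{W_NS}}}t"))
```

Parsing the rendered bytes back with extract_text is a cheap way to confirm that the part is still well-formed XML before writing it into the package.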
Well guys, that's all. This is a simple application, but it demonstrates a little bit of the power of the OpenXML SDK. You can extend your own custom applications using this SDK.
See ya