Category: Articles

Fonts: what they are and popular font formats

The modern world is always looking to improve on technology, and font technology is no exception. The array of available font styles is vast, which makes it possible to choose a distinct font for almost any kind of material.

What are fonts?

Fonts are sets of displayable or printable letters, digits or symbols in a specific style and size. A typeface is the overall design shared by a group of related fonts, which together form a typeface family. The font dialogue box of a word processor typically offers settings such as font style, font colour, text effects and character spacing. For example, Times New Roman is a TrueType font, and a document set in it will print as it appears on screen. Within the Times New Roman typeface family, options such as font style, text highlight colour and underline are among the many settings one can select.

Font sizes in common use range from 8 to 72 points, with 10 and 12 points being the most preferred. There are two types of fonts, namely fixed-pitch fonts and proportional fonts. In a fixed-pitch font every letter has the same width, whereas in a proportional font the width varies from letter to letter. In general, choosing the appropriate typeface family and related fonts depends on the data and on the appearance one wants to achieve, whether on screen or in print.

There are three computer font file data formats:

  1. Bitmap fonts (Raster fonts)
  2. Outline fonts (Vector fonts)
  3. Stroke fonts

Bitmap fonts

This format consists of a matrix of dots (pixels) representing the image of each glyph in each face and size. Bitmap fonts are also called raster fonts, although that term is used less often. A printer identifies the characters stored for the document and reproduces the matching dots. Each variation of the typeface requires its own set of bitmaps. Bitmap fonts are generally considered to be device dependent.

Advantages of Bitmap fonts are:

  • Very fast and straightforward to render
  • Much simpler to generate
  • Bitmap fonts give exactly the same output whenever they are shown on displays of the same specification
  • They give the best results at small sizes

The primary disadvantage of Bitmap fonts is their poor visual quality, particularly when enlarged, in comparison to Outline and Stroke fonts. The fonts are nevertheless a standard feature in some computer operating systems. Digital Bitmap fonts may be monochrome or use ‘shades of grey’. Bitmap fonts remain widely used in computer systems, particularly for screen displays.
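The loss of quality on enlargement can be illustrated with a small sketch. The 5x7 glyph below is a made-up example rather than data from any real font, and the scaling simply repeats each pixel, which is one simple way (assumed here) of enlarging a bitmap.

```python
# A hypothetical 5x7 bitmap for the letter "A"; each string is one row of pixels.
# '#' is a black dot, '.' is a white dot.
GLYPH_A = [
    ".###.",
    "#...#",
    "#...#",
    "#####",
    "#...#",
    "#...#",
    "#...#",
]


def scale_bitmap(rows, factor):
    """Enlarge a bitmap by repeating every pixel `factor` times in both directions."""
    scaled = []
    for row in rows:
        wide_row = "".join(pixel * factor for pixel in row)
        scaled.extend([wide_row] * factor)
    return scaled


if __name__ == "__main__":
    print("\n".join(GLYPH_A))
    print()
    # The enlarged glyph keeps the same coarse steps, only bigger.
    print("\n".join(scale_bitmap(GLYPH_A, 3)))
```

Enlarging adds no new detail: every original dot becomes a larger square, which is why the edges look blocky.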

Outline fonts

Also known as Vector fonts, Outline fonts describe each character shape with drawing instructions and mathematical formulae, typically Bezier curves, so the outlines can be scaled to any size. In vector structures, each image is interpreted algorithmically, and the rendered dimensions differ with the font style. Outline fonts are therefore known as scalable fonts, as their height and width can be adjusted. Vector fonts are also known as object-oriented fonts. Their main advantage over Bitmap fonts is that they make the most of high-quality output devices: the higher the quality of the device, the better a vector font appears. There are three formats which are used for outline fonts:

  • Type 1 and Type 3 fonts – These formats were created by Adobe for professional digital typesetting.
  • TrueType fonts – This format was introduced by Apple Inc. and is very popular, with many implementations available.
  • OpenType fonts – Designed by Microsoft and Adobe and regarded as a smart font system, this format can change how characters are visually presented to improve the reading experience.
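As a rough illustration of why outlines described by Bezier curves scale cleanly, the sketch below evaluates one quadratic Bezier segment (the kind of curve TrueType outlines use) at two sizes. The control points and function names are made up for the example and do not come from any real font.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Return the point on a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def scale_points(points, factor):
    """Scale control points; the curve itself is recomputed, not stretched."""
    return [(x * factor, y * factor) for x, y in points]


def sample_curve(control_points, steps):
    """Sample steps + 1 evenly spaced parameter values along the curve."""
    p0, p1, p2 = control_points
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


# Hypothetical control points for one arc of a glyph, in font units.
ARC = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)]

if __name__ == "__main__":
    print(sample_curve(ARC, 4))
    # At 100 times the size, more samples can be taken from the same formula,
    # so the curve stays smooth instead of turning blocky.
    print(sample_curve(scale_points(ARC, 100), 16))
```

Because only the control points are stored, the same description serves every size; the renderer decides how finely to sample it.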

Stroke fonts

This font format uses a series of specified lines, plus additional information, to define the form, shape or size of the line in a specified face. Stroke fonts are heavily promoted in East Asian markets, where their use on embedded devices is widespread. Their advantages include reducing the number of vertices required to define a glyph. Metafont is an example of a stroke-font format.

PS Files

Developed by Adobe, PostScript files (PS files) are widely used by publishers for printing purposes. They are saved in the PostScript page description language and may contain vector graphics, Bitmap graphics or text, so images and text can appear on the same page. A PostScript file is identified by its ‘.ps’ suffix.
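To give a feel for what a page description looks like, here is a minimal sketch that builds the source of a one-page PostScript document as text. The helper names are invented for this example; the PostScript operators used (`findfont`, `scalefont`, `setfont`, `moveto`, `show`, `showpage`) are standard, and the position and font size are arbitrary choices.

```python
def minimal_postscript(text):
    """Return the source of a one-page PostScript document that prints `text`."""
    # Backslashes and parentheses must be escaped inside a PostScript string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "\n".join([
        "%!PS",                                      # header comment marking a PostScript file
        "/Helvetica findfont 12 scalefont setfont",  # select a 12-point font
        "72 720 moveto",                             # 1 inch from the left, 10 inches up
        f"({escaped}) show",                         # draw the text
        "showpage",                                  # emit the page
        "",
    ])


def looks_like_postscript(source):
    """Heuristic check (an assumption of this sketch): PS sources start with '%!'."""
    return source.startswith("%!")


if __name__ == "__main__":
    source = minimal_postscript("Hello (world)")
    print(source)
    # Saving this text under a name ending in .ps gives a file a PostScript
    # interpreter or printer can render.
```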

Conclusion:

There are many different types of fonts, and their popularity varies across industries and locations. Overall, PS files are widely used by publishers for printing purposes. The other formats have their own importance and popularity in various industries.

Our PS Converter supports conversion of PS files into various other formats; for example, you can use it to convert a PS file to BMP.

Five reasons why you should convert Word documents to PDF format

Microsoft Office is one of the most widely used office software suites among companies and individuals. Almost any document or presentation can be made in Microsoft Office. This article was written in Microsoft Word, and the software and its features do not disappoint. However, there are several reasons why you should consider converting DOC files into PDF format, because PDF gives the writer more options in the contemporary world. Let’s have a look at them.

Word documents change their format, PDFs do not

Imagine spending a whole day writing a long article and choosing the best format and page size, then taking it to the office, opening it on a different computer and having to format it all over again! This is one of the biggest problems with the DOC format. When a document is opened on a different computer or with different software, it may fall back to default formatting, which ruins the layout. Since some software does not support every Microsoft Office feature, not only individual format settings but the whole document setup can change, so the document looks entirely different.

One of the main reasons why Adobe developed PDF was to provide a uniform format for every type of document, including DOC files, so that they can be viewed on any kind of device and operating system. Once a document is converted to PDF, its style and layout are fixed and stay the same regardless of the device or system.

References can cause a lot of trouble

As mentioned above, when a document is opened with different software or on a different computer, its contents can become disorganized. This commonly happens with e-books or documents that have a table of contents. Because the document is opened with default settings, the page size can change along with the margins or guidelines, which in turn changes page numbers and line spacing, so references to page numbers no longer match. This is yet another reason why writers prefer the PDF format for books or articles that include pictures or a table of contents.

Different versions, different problems

Every piece of software was once new; as time went by, newer and better versions were introduced with advanced features and improvements. The same happened with Microsoft Office, although the improvements brought a small problem with them. Documents created in a new version of Word often do not open properly in older versions, which lack several features that are prominent in the new versions. Conversely, newer versions can open documents from old versions, but with slight changes. This can result in document alterations: changes of font or image display are some of the problems that accompany the common Word problems mentioned above. As stated, this can change the format or alter the content orientation as well.

Word is not the only one on the market

Billions of people use computers every day, and millions of them use them to write books, articles, and whatnot. It is very likely that not all of them are familiar with Microsoft Word, and they may be using other word processing software. This means that their documents can be in a different style, font, size, or orientation. Although such documents can be opened in Word, they will often look different. Word may choose a default font if the original font is not available, or it may reformat the document, which means the page size can change and so will the page numbers. You will experience the same consequences when opening a Word file in different word processing software. A PDF, by contrast, can be opened in any PDF viewer and keeps its format. This ensures your documents remain intact and unaltered.

What is better than software which is mobile friendly?

In this fast-developing world, most business and digital work depends on computers. Since you cannot access your computer everywhere, some people use mobile devices as an alternative to stay up to date with their work. They often view documents and presentations on their phones or tablets. This becomes difficult when the file they want to look at is in DOC format. Why? Because most of the time DOC files require apps that are not produced by Microsoft. These apps may not support some features, resulting in alterations. Furthermore, apps that handle DOC files well are rarely free and can be costly. PDF, on the other hand, has lots of free and user-friendly apps which can be downloaded on any mobile device and allow the user to view PDF files without any problems.

Our DOC to PDF Converter provides an easy and hassle-free way to convert Word documents into PDF files, so you can always use it to convert your DOC files, or you can use our DOC Converter to convert DOC files into many other formats besides PDF.

OCR (Optical Character Recognition) – a technical overview

Optical Character Recognition (OCR) is software that assists in reading text and translating an image into a text file. An OCR system comprises an optical scanner for reading the text and specialized software for converting the image to text. This software eases the reading of complicated letters, books, and journals. OCR is capable of reading text in a large variety of fonts. Unfortunately, OCR does not provide great support for handwritten text. This article covers the technology behind OCR, the essential elements of OCR, the principles OCR is based on, and how a scanned image can be converted to an editable file using OCR.

Technology Behind Optical Character Recognition

One of the most advanced OCR products currently available is ABBYY FineReader. OCR usually works on three basic principles: integrity, purposefulness, and adaptability. It is easy to use and consists of three steps: scanning a document, recognizing it, and storing it in the right format (RTF, XLS, PDF, or TXT). How does OCR recognize text? It first divides the document page into elements such as blocks of text, images and tables. Lines are divided into words and then into characters. Once the characters have been singled out, the software compares them with a set of pattern images and advances several hypotheses about what each character is. Based on these hypotheses, the program analyses different ways of dividing lines into words and words into characters. After processing a huge number of such probabilistic hypotheses, the software makes its decision and presents you with the recognized text. Modern OCR software such as ABBYY FineReader provides dictionary support for more than 45 languages. This enables secondary analysis of text elements at the word level, which makes analysis and recognition more accurate and helps in extracting text from complex documents. Together, the three basic principles give OCR the reliability and intelligence that bring it close to human recognition.

Essential Elements of OCR

The essential elements are scanning and recognition. Both elements involve various procedures.

Recognition: images captured with a scanner or a digital camera can also be recognized by OCR and converted to text. Digital camera images need to be taken in bright light to be recognized well. Modern OCR software such as ABBYY FineReader has dependable recognition technology aimed at processing camera images. It is built to counter image distortion at the edges and so improve recognition.

Scanning: OCR software can scan to two types of output, PDF and Word. Scanning to PDF ensures layout accuracy; the scanned document retains its original look on screen, resembling a virtual photocopy, and one can click on a single word or listen to the whole document. Scanning to Word is done for flexibility, making it possible to edit the text and change the layout.

Converting Scanned Images to Editable File

Converting scanned images from the source to an editable file using OCR involves several steps.

Step 1:

This step detects the direction of the text. A scanned image is never perfectly aligned, so you need to rotate it slightly until the text lines are exactly horizontal.

Step 2:

This step determines whether the text is set in a single column or a double column.

Step 3:

In every column, you need to locate the ‘baseline’ position of each consecutive text line. Double-column text needs to be converted into a single column, a “long ribbon of printed characters”. The format used is black and white.

Step 4:

Split this ribbon into individual characters by detecting vertical stripes of white pixels. Each “token” is a rectangular mini-image of black and white pixels. If two tokens are separated by more than the average white space, add a “space” token between them.
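A minimal sketch of this splitting step follows. It assumes the ribbon is already a black-and-white image stored as a list of equal-length strings, with '#' for black pixels; the function names and the tiny sample ribbon are invented for illustration.

```python
def split_into_tokens(ribbon):
    """Split a binary image into tokens separated by all-white columns.

    Returns a list of (start_col, end_col) ranges, end exclusive.
    """
    width = len(ribbon[0])
    # A column is blank when no row has a black pixel in it.
    blank = [all(row[col] != "#" for row in ribbon) for col in range(width)]
    tokens = []
    start = None
    for col in range(width):
        if not blank[col] and start is None:
            start = col                      # a token begins
        elif blank[col] and start is not None:
            tokens.append((start, col))      # a white stripe ends the token
            start = None
    if start is not None:
        tokens.append((start, width))        # token touching the right edge
    return tokens


def insert_spaces(tokens):
    """Add a 'SPACE' marker wherever the gap is wider than the average gap."""
    if len(tokens) < 2:
        return list(tokens)
    gaps = [tokens[i + 1][0] - tokens[i][1] for i in range(len(tokens) - 1)]
    average = sum(gaps) / len(gaps)
    result = [tokens[0]]
    for gap, token in zip(gaps, tokens[1:]):
        if gap > average:
            result.append("SPACE")
        result.append(token)
    return result


# Hypothetical two-row ribbon containing three character tokens.
RIBBON = [
    "##.#...##",
    "##.#...#.",
]

if __name__ == "__main__":
    print(insert_spaces(split_into_tokens(RIBBON)))
```

Real scans would need some tolerance for noise and touching characters; this sketch treats any completely white column as a separator.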

Step 5:

Go through the tokens, comparing each one, pixel by pixel, with templates of known characters (letters, numbers, etc.). Compute the distance between the token and each of the selected templates. The character with the shortest distance is taken as the right one. In practice, however, this step does not return a single character for each token but a set of probabilities.
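The comparison of a token with character templates can be sketched as follows. The distance used here is a simple count of differing pixels, and the conversion of distances into probabilities (weight 1 / (1 + distance), then normalized) is an assumption made for this example, not a method described in the article. The 3x3 templates are made up.

```python
# Hypothetical 3x3 templates; real templates would be larger and more numerous.
TEMPLATES = {
    "I": ["###", ".#.", "###"],
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
}


def pixel_distance(image_a, image_b):
    """Count the pixels that differ between two equally sized images."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def best_match(token, templates):
    """Return the character whose template is closest to the token."""
    return min(templates, key=lambda ch: pixel_distance(token, templates[ch]))


def match_probabilities(token, templates):
    """Turn distances into probabilities: closer templates get more weight."""
    weights = {
        ch: 1.0 / (1.0 + pixel_distance(token, template))
        for ch, template in templates.items()
    }
    total = sum(weights.values())
    return {ch: weight / total for ch, weight in weights.items()}


if __name__ == "__main__":
    token = ["###", ".#.", "#.#"]  # a slightly damaged "I"
    print(best_match(token, TEMPLATES))
    print(match_probabilities(token, TEMPLATES))
```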

Step 6:

Combine these probabilities with a second set of probabilities, called the language model, which is always specific to a language, e.g. English. For example, suppose the previous letters read “who” and the image match gives the letter “I” and the digit “1” the same probability for the next token. The language model will then favor the letter “I” over “1”. OCR without this step, that is, without any knowledge of the language, typically produces many nonsensical words, whereas OCR with a language model can produce a good transcript even from a degraded image (for example, one captured with an out-of-focus camera, or one printed on both sides of the paper so that text from the back shows through).
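The way a language model breaks a tie such as “I” versus “1” can be sketched like this. Multiplying the two probabilities and renormalizing is one common way to combine them and is assumed here; the numbers are invented for the example.

```python
def combine(image_probs, language_probs):
    """Multiply image and language-model probabilities, then renormalize."""
    scores = {
        ch: prob * language_probs.get(ch, 0.0)
        for ch, prob in image_probs.items()
    }
    total = sum(scores.values())
    if total == 0:
        # The language model rules out every candidate: fall back to the image.
        return dict(image_probs)
    return {ch: score / total for ch, score in scores.items()}


if __name__ == "__main__":
    # The image alone cannot tell the letter from the digit.
    image_probs = {"I": 0.5, "1": 0.5}
    # Made-up language-model values: after letters, another letter is likelier.
    language_probs = {"I": 0.9, "1": 0.1}
    combined = combine(image_probs, language_probs)
    print(max(combined, key=combined.get), combined)
```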

To convert your scanned images into an editable format, you can use our online OCR Conversion tool.

 

XML: what is it? History and Uses

There is a wide variety of file extensions, and new ones pop up every day. For an IT expert, the XML file extension is nothing new, but to a new user it might sound like an extension for malware. It is not malware; it is one of the most widely used file formats. XML stands for Extensible Markup Language. It is a kind of markup language for describing and annotating data so that it is understandable by humans as well as machines. XML files are thus used to interpret, transport, structure and store data. XML was designed for general use across the Internet. Dr. Charles Goldfarb, who was involved in the development of XML, says that XML is a kind of holy grail for computing because it solves the problem of universal data exchange between dissimilar systems.

Background

The history of XML goes back to the 1970s, when three IBM employees, Charles Goldfarb, Ed Mosher and Ray Lorie, introduced a new technique for producing technical documents, called GML, in which tags were used to structure the data. The name consisted of the initials of their surnames: Goldfarb, Mosher, and Lorie. Later, Goldfarb introduced the term Generalized Markup Language to better reflect the use of GML. In 1986, ISO adopted it as SGML (Standard Generalized Markup Language). SGML is not an actual markup language; rather, it provides a specification for defining other languages. For example, HTML, which specifies tags for web pages, is an application of SGML.

Although HTML was popular from the start, it had many problems that created headaches, especially for web developers. Some aspects of presentation were left to the discretion of end users, whereas page designers wanted to fix their designs. The competition between Internet Explorer and Netscape caused the standards to diverge and created more problems for web developers. These problems and the lack of a standard were causing the original idea of web pages and web services to drift away, because the interpretation of web content was no longer uniform.

Because of these problems, it was widely accepted that HTML was too limited and SGML too complicated, so there was demand for something better. In the 1990s, a large group of people well known in the industry collaborated, using email and teleconferences, and came up with XML. XML stands for Extensible Markup Language. Like SGML, it is a specification for defining other markup languages. It was developed with the aims of general-purpose usability, stability, conciseness, formality and a minimum of optional features. Use on the Internet is only one of its applications. With XML, it is easy to specify and store any data, which can then be imported and processed by any application on any platform.

Uses

XML has a wide variety of uses. Some of them are as follows.

In general, XML provides a standardized way to store, access and display data.

In web searches and other web tasks, XML makes it easier to get the desired results because XML defines the kind of data stored in the file.

The most popular use of XML is in web development. With XML it is easier to develop interactive and varied pages that give the customer the option to customize. Data is stored once in XML and can then be presented to different users with different styles of viewing and processing.
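The idea of storing data once and presenting it in different ways can be sketched with Python's standard `xml.etree.ElementTree` module. The catalog below, its element names and the two output styles are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, stored once as XML.
CATALOG_XML = """<catalog>
  <book id="b1"><title>Fonts Explained</title><price>12.50</price></book>
  <book id="b2"><title>XML Basics</title><price>8.00</price></book>
</catalog>"""


def load_books(xml_text):
    """Parse the XML text into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": book.get("id"),
            "title": book.findtext("title"),
            "price": float(book.findtext("price")),
        }
        for book in root.findall("book")
    ]


def as_plain_list(books):
    """One presentation: a readable list for people."""
    return "\n".join(f"{b['title']}: {b['price']:.2f}" for b in books)


def as_csv(books):
    """Another presentation: comma-separated rows for a spreadsheet."""
    rows = [f"{b['id']},{b['title']},{b['price']:.2f}" for b in books]
    return "\n".join(["id,title,price"] + rows)


if __name__ == "__main__":
    books = load_books(CATALOG_XML)
    print(as_plain_list(books))
    print(as_csv(books))
```

The tags name each piece of data, so any program that reads the file knows which value is the title and which is the price without guessing from position.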

XML is also becoming popular for data storage among enterprises. With XML, it is easy to exchange data across different platforms. Business processes are connected on a global scale more than ever. If data hubs use standardized XML, businesses can interact with each other and with customers with ease. Many industries have created systems for storing information in a standardized form using XML for better interaction within the industry. Finance, health care, the sciences, and the music industry are just a few of those using XML for standard data storage. In the publishing industry, for example, XML is used across document publishers, and XML is the basis for applications from Microsoft as well as Google.

Layout

There are two methods of defining the layout of data in an XML document: the DTD (Document Type Definition) and XML Schema. The DTD is inherited from SGML, whereas an XML Schema is written using XML syntax. The DTD has some limitations: it treats all data in a document as strings and cannot apply specific rules to specific data. Because that is not adequate for some applications, the XML Schema language was introduced.
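The difference in syntax can be seen in a small sketch. The two rules below describe the same hypothetical `price` element: the DTD rule can only say it holds character data, while the XML Schema rule gives it a decimal type. Python's standard library does not validate documents against either kind of definition, so the code only checks that the schema is itself well-formed XML and the DTD rule is not; the helper names are invented.

```python
import xml.etree.ElementTree as ET

# DTD declaration: its own syntax, and the content is just character data.
DTD_RULE = "<!ELEMENT price (#PCDATA)>"

# XML Schema declaration: ordinary XML, with a specific data type.
XSD_RULE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="price" type="xs:decimal"/>
</xs:schema>"""

XS_NAMESPACE = {"xs": "http://www.w3.org/2001/XMLSchema"}


def is_well_formed_xml(text):
    """Return True if the text parses as a standalone XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def declared_type(xsd_text, element_name):
    """Read the declared type of a top-level element from a schema."""
    root = ET.fromstring(xsd_text)
    for element in root.findall("xs:element", XS_NAMESPACE):
        if element.get("name") == element_name:
            return element.get("type")
    return None


if __name__ == "__main__":
    print(is_well_formed_xml(XSD_RULE))   # the schema is an XML document
    print(is_well_formed_xml(DTD_RULE))   # the DTD rule is not
    print(declared_type(XSD_RULE, "price"))
```

Because a schema is XML, the same tools that read the data can read its definition, which is part of why it was introduced alongside the DTD.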

You can use our online XML Converter to convert XML files into other formats or you can also convert your PDF files into XML using our PDF to XML Converter.