IMUG Past Events Archive: 2007
2007 Events
- Nov: OpenOffice.org Internationalization (I18N) Framework
- Oct: Collaborative Website Translation via the Worldwide Lexicon
- Sep: Challenges of Searching the Global Internet
- Aug: How Acquisitions Affect I18N and L10N Teams: A View from the Client Side
- Jul: Meedan: a Project for a World That Doesn't Talk Enough - Hybrid Distributed Natural Language Translation and a Visual Social Media Browser
- Jun: A Photographic Journey to Luang Prabang, Laos and Angkor Wat, Cambodia
- May: Language Support on the One Laptop Per Child XO Computer
- Apr: Challenges in Machine Translation
- Mar: Getting it Right the First Time: Streamlining your software localization and avoiding costly mistakes
- Feb: How to Be a CSI
- Jan: The Localization Maturity Model
Want to Speak at an IMUG Event? Want to Suggest a Speaker?
To propose a presentation for an upcoming IMUG meeting, to suggest a speaker for us to invite, or to get information on how to prepare, please see Information for Presenters.
2007 Event Announcements:
OpenOffice.org Internationalization (I18N) Framework
Speaker: Karl Hong, Sun Microsystems, Inc.
OpenOffice.org is a multilingual and multiplatform office suite, available on Microsoft Windows (from 98 up to Vista), GNU/Linux ("Linux"), Sun Solaris, Mac OS X (under X11), and FreeBSD. It supports 100+ languages and is compatible with major office suites such as Microsoft Office.
OpenOffice.org is an open-source project. The product is free to download, use and distribute.
The OpenOffice.org I18N framework offers full-featured I18N functionality that supports:
- West and East European languages
- East Asian (CJK) languages
- Complex Text Layout (CTL) such as Thai, Hebrew and Arabic
This I18N framework is built on the UNO component model, which provides the scalability to add new I18N components with minimal effort.
Karl Hong is a staff software engineer at Sun Microsystems, Inc. He has more than ten years of experience in the internationalization and localization area. During his ten-year tenure at Sun, Karl has worked on a number of projects including Solaris Chinese language support, Java internationalization on Solaris and Netscape localization. He is the major contributor to and owner of the OpenOffice.org I18N framework. His current responsibility is to provide internationalization consultation to various Sun product development groups, such as storage and LDoms (Logical Domains).
Collaborative Website Translation via the Worldwide Lexicon
Speaker: Brian McConnell, Worldwide Lexicon Project
The Worldwide Lexicon Project (www.worldwidelexicon.org) is an open source, collaborative translation system for websites and publishers. The current version of WWL, available now for WordPress, Firefox and PHP-based sites, enables a website's readers, as well as volunteer or staff translators, to create, edit and share translations to and from almost any human language.
The system relies entirely on people to contribute and revise translations. WWL is tightly integrated with publishing and content management systems such as WordPress, and displays translations inline with the original text, as well as via separate websites. The system has been in public beta for several months, and has logged users from over 110 countries. The project's long-term goal is to make collaborative translation universally accessible to readers and publishers by embedding it in a wide range of publishing platforms and web services.
Brian McConnell, the project leader, will demonstrate WWL on several publishing environments, discuss the current status of the project, and outline its future development plans.
Brian McConnell is a telecommunications developer, author and entrepreneur. He has founded three telecom businesses since arriving in California in the mid-1990s. Brian is also a frequent contributor to O'Reilly & Associates, where he writes about software development, technology and society. He is currently the Vice President for Advanced Product Design at Virtual PBX Inc, which acquired his most recent company, Open Communication Systems, a developer of telecommunications software, earlier this year.
Challenges of Searching the Global Internet
Speaker: Peter Linsley, Ask.com
Search engines work hard to deliver on the simple promise: to take your keywords, sift through several billion documents and come up with a small handful most likely to answer your query. With a focus on language, character set, and cultural issues, the presentation will cover some key challenges in crawling, indexing and making documents of the world searchable.
Peter Linsley joined Ask.com in 2005 and currently works as a Technical Product Manager in the Search Technology group.
Prior to Ask.com, Peter worked as an engineer at Oracle Corporation in Japan and Silicon Valley where he worked on internationalization of the database and was also responsible for bringing globalized regular expression support to Oracle SQL.
Hailing from North East England, Peter now resides in the sunny half of San Francisco.
How Acquisitions Affect I18N and L10N Teams: A View from the Client Side
Speaker: Carrie Fischer, Oracle Corporation
Acquisition is a common way to grow a company's revenue base, enrich the current product line, eliminate competition, or all of the above. There are practical issues for the acquired company's Development team to address, such as process changes, new file type requirements, and language additions. But what about changes in corporate culture, political issues, and the effect on teams that don't have a home in the new organization? This presentation will address the practical changes required of the newly acquired company, and especially their effect on employees in the localization and internationalization teams. We've seen how acquisition affects the vendor side; now let's take a look at the client side.
Carrie Fischer graduated with a French degree from Northern Illinois University. She worked for a then start-up company in New Hampshire called Transparent Language for 8 years, and she has worked for Hyperion Solutions as the director of Localization for 9+ years. She now works for Oracle as the Director of Program Management. She co-wrote 3 articles for IWIPS on the importance of designing global products and reaching global audiences (under her maiden name Carrie Livermore).
Meedan: a Project for a World That Doesn't Talk Enough - Hybrid Distributed Natural Language Translation and a Visual Social Media Browser
Speaker: Ed Bice, Founder & CEO, Meedan
Ostensibly, this is going to be a talk about a hybrid approach to language translation. We will, however, spend a bit of time looking at the larger Meedan project, which can be roughly understood as a design-heavy effort to lure the world into talking to itself by creating the iPod of cross-language global social media browsers. (No, this is not the official branding line, but perhaps it will draw a crowd.) Meedan's open beta is scheduled to release in early 2008 with English/Arabic chat and batch translation capability. We are currently working with IBM researchers on a significant effort to train an MT engine on Arabic and English chat and IM.
HDNLT is a new approach to language translation – reputation-based, with redundant distribution and re-assembly of text fragments using a mixed network of human and machine translators. High-quality translations are obtained by marshaling the resources of a large number of intermittently available translators with varying levels of competency. HDNLT is useful in cases where machine translation is unreliable (or even non-existent), and especially in cases where the discourse in dialogues or documents is colloquial, dialectal, and informal. The basic principles of HDNLT are language-independent. The approach is consistent with emerging trends in so-called "Web 2.0" applications, where overall value arises from small, shared contributions that are combined using reputations adjusted by performance measures and user feedback.
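As a rough illustration of the idea described above (redundant translation of text fragments, combined by translator reputation), here is a minimal Python sketch. It is not Meedan's implementation: the function names, the data layout and the simple "sum of reputations" voting rule are all assumptions made for this example.

```python
from collections import defaultdict

def pick_translation(candidates):
    """Choose one translation for a fragment by reputation-weighted vote.

    candidates: list of (reputation, text) pairs, one per translator
    (human or machine). Hypothetical layout, assumed for illustration.
    """
    scores = defaultdict(float)
    for reputation, text in candidates:
        # Identical translations pool the reputation of their authors.
        scores[text] += reputation
    return max(scores, key=scores.get)

def reassemble(fragments):
    """Re-assemble a document from the winning translation of each fragment."""
    return " ".join(pick_translation(candidates) for candidates in fragments)

# Each fragment was sent to several translators (redundant distribution).
fragments = [
    [(0.9, "Hello"), (0.4, "Hi"), (0.3, "Hi")],
    [(0.5, "world"), (0.6, "world"), (0.8, "earth")],
]
result = reassemble(fragments)
print(result)
```

In this toy run the first fragment is decided by one high-reputation translator, and the second by two lower-reputation translators who agree with each other. A real system would also have to update reputations from feedback, which this sketch leaves out.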
Ed Bice has a BA in Philosophy from Carleton College, where he was fortunate to study with the late Paul Wellstone. He founded and led an environmental residential design company in Teton Valley, Idaho, for twelve years before he and his family moved to San Francisco, arriving on September 11, 2001. He has been working on various aspects of the Meedan project since 2003. Meedan is supported by IBM, MacArthur, Ford, Cisco, and Nathan Cummings, among others.
A Photographic Journey to Luang Prabang, Laos and Angkor Wat, Cambodia
Speaker: James Turley, Sky Image Lab
James Turley will present a hi-res photographic travelogue (with some linguistic content) of his recent travels to the mountains and jungles of Laos and Cambodia. In Laos, he was fortunate to arrive at the beginning of the "Lao Pi Mai" Buddhist New Year Water Festival. You're sure to enjoy the climax of the festivities, with the crowning of Miss Lao New Year in the historic Royal City of Luang Prabang.
From there, follow him down the Mekong to Siem Reap, Cambodia, the gateway to the stunning temple complex of Angkor Wat. Explore the strangled temples of the hidden city of Angkor Thom adorned everywhere by lovely Apsaras, mythical temple fairies. Examine ancient Sanskrit and Khmer inscriptions on these old temples.
James Turley runs a small localization consulting company specializing in obscure Asian locales. At night, he runs an Internet site publishing astronomical images from the great observatories of the world. James loves to travel, and when he's not traveling, he's planning his next trip.
Language Support on the One Laptop Per Child XO Computer
Speaker: Ed Cherlin, One Laptop Per Child
The XO laptop (the first computer to run Free Software exclusively, including BIOS and drivers) is going into six countries this year, and many more in the future, requiring support for many languages. After a brief demonstration of the XO laptop's general features, we will look at built-in language support, localizations of Linux currently in development, and future requirements. If time permits, we can discuss the implications of all this for language technology, the end of poverty, and the future in general.
Ed Cherlin has been a mathematician, Peace Corps volunteer, Buddhist priest, market researcher, software developer, writer, and editor. He is currently volunteering with One Laptop Per Child and organizing a new non-profit, Earth Treasury, to connect schools around the world for shared education and business opportunities. He has previously talked to IMUG about language support in Linux.
Challenges in Machine Translation
Speaker: Dr. Franz Josef Och (Senior Staff Research Scientist, Google, Inc.)
In recent years there has been an enormous boom in MT research. There has been not only an increase in the number of research groups in the field and in the amount of funding, but there is now also optimism for the future of the field and for achieving even better quality.
The major reason for this change has been a paradigm shift away from linguistic/rule-based methods towards empirical/data-driven methods in MT. This has been made possible by the availability of large amounts of training data and large computational resources. This paradigm shift towards empirical methods has fundamentally changed the way MT research is done. The field faces new challenges.
For achieving optimal MT quality, we want to train models on as much data as possible, ideally language models trained on hundreds of billions of words and translation models trained on hundreds of millions of words. Doing that requires very large computational resources, a corresponding software infrastructure, and a focus on systems building and engineering.
In addition to discussing those challenges in MT research, this talk will also give specific examples of how some of the data challenges are being dealt with at Google Research.
Franz Josef Och joined Google in 2004 as a research scientist, where he leads the machine translation group. He has been working on statistical machine translation since 1997.
Franz worked as a Research Scientist at the Information Sciences Institute at the University of Southern California from 2002 to 2004. His main research interests are statistical machine translation, natural language processing and machine learning. He has co-authored more than fifty scientific papers and has written several open-source software packages related to statistical natural language processing.
He received a PhD in Computer Science from RWTH Aachen, Germany, in 2002 and his Diploma Degree in Computer Science from the University of Erlangen-Nuremberg, Germany, in 1998.
For more information on the speaker, please visit http://www.google.com/research/pubs/och.html
Getting it Right the First Time: Streamlining your software localization and avoiding costly mistakes
Speaker: Luciano Arruda (viaLanguage, Inc.)
With global markets currently dominated by emerging economies and a generally good business climate internationally, the allure of new international customers is difficult for most companies to ignore. In order to meet that demand, organizations are starting to take a closer look at ways to make more out of their existing software localization budget and cover a wider range of markets. How many new languages could you add to your current release projects without increasing your team's head count? To meet those types of demands, high-performance localization teams take advantage of the latest tools and technologies to increase their efficiency and improve their quality. But what does that look like?
A large portion of the cost associated with software localization is related to your globalization process, the tools you use and how your localization team/partners interact. In theory, localization costs should decrease as you move from one localized product release to the next since, with successive releases, product management teams gain experience and adopt more efficient tools and processes. The team should become increasingly productive! In reality, it doesn’t always work that way. In fact, sometimes just the opposite occurs. Localization expenses seem to go up with each release without much improvement in quality.
This 45-minute workshop will explore these topics:
- Major strategic components and project milestones for software localization projects
- How to implement more efficient processes and make better use of localization vendors
- Most common mistakes made in software localization projects, and how to avoid them
- Information on technologies and services that will help your organization become more streamlined
The speed of globalization will only increase in the coming years. By being armed with the appropriate processes and tools, your localization efforts can be streamlined to take advantage of those opportunities as they present themselves, keeping you ahead of the curve -- and ahead of your competitors.
Luciano Arruda (Sales Engineer, viaLanguage) is the software account liaison for viaLanguage, supporting client localization needs and integrating customized solutions for software localization projects. He has 8 years of experience in the software localization industry working for companies like Macromedia, Looksmart and Adobe in Program and Project Management, Localization Engineering and Quality Assurance. Born in Brazil, Luciano is fluent in Portuguese and speaks Spanish. He holds a B.S. in Computer Science from Cal State Hayward University.
How to Be a CSI
Speaker: Tex Texin (Yahoo! Inc.)
Join Tex and the CSI team, Grissom, Willows, Sidle, Stokes, et al. in the forensic analysis of character encoding crimes. This presentation will elaborate on the techniques of CSI forensic analysis and its application to debugging character encoding problems in software and web applications. Several example problems will be diagnosed.
This presentation was recently given at the Unicode Conference in Washington DC.
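The abstract does not list the example problems diagnosed in the talk. As a generic illustration of the kind of character encoding "crime" it refers to, the Python sketch below shows one well-known case: UTF-8 bytes mistakenly decoded as Latin-1, and the reversal of that mistake. The helper names `garble` and `repair` are invented for this example.

```python
def garble(text: str, wrong_codec: str = "latin-1") -> str:
    """Commit the 'crime': encode as UTF-8, then decode with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)

def repair(text: str, wrong_codec: str = "latin-1") -> str:
    """Undo it: re-encode with the wrong codec, then decode as UTF-8."""
    return text.encode(wrong_codec).decode("utf-8")

sample = "caf\u00e9"          # "café"
damaged = garble(sample)      # the two UTF-8 bytes of "é" become two characters
restored = repair(damaged)

print(repr(sample), "->", repr(damaged), "->", repr(restored))
```

The repair only works when the wrong decoding step was lossless, as it is with Latin-1, which maps every byte to a character. If characters were replaced or dropped along the way, the original bytes cannot be recovered this way.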
Tex Texin (Internationalization Architect, Yahoo! Inc.) has been providing globalization services including architecture, strategy, training, and implementation to the software industry for many years. Tex has created numerous globalized products, managed internationalization development teams, developed internationalization and localization tools, and guided companies in taking business to new regional markets.
Tex is also an advocate for internationalization standards in software and on the Web. He is a representative to the Unicode Consortium and has been an invited expert to the World Wide Web Consortium.
Tex maintains two Web sites for internationalization, the popular http://www.I18nGuy.com and a site focused on the Progress Software community and others: http://www.XenCraft.com.
Localization Maturity Model
Speaker: Don DePalma (Common Sense Advisory, Inc.)
Localization is a black art to some companies, a well-defined process to others, and a continuing journey for most. Because many organizations will pass the same milestones on their way to localizing their wares or their communication channels, Common Sense Advisory decided that it is time to document those landmarks. Don DePalma will review the firm's recently issued report on this topic.
He will discuss localization using capability maturity model (CMM) frameworks pioneered by Carnegie Mellon University's Software Engineering Institute (SEI). The firm's description of localization maturity model (LMM) behaviors is based on their four years of research at Common Sense Advisory and on many years of prior research, observation, and market participation at previous firms as consultants, advisors, and providers.
The goal of their LMM research is to identify the metrics and key process areas (KPA) for graduating from one phase to the next. Specific recommendations are offered on how to discover, analyze, and improve the process, organizations, and technology used to transform software, websites, and other content for global or domestic ethnic markets.
This talk will highlight the main points in the report including: 1) an overview of the organizational, process, and technology lifecycles of localization; 2) a discussion of the behaviors, processes, and activities that constitute defined, managed, and repeatable best practices; and 3) recommendations for moving up the maturity ladder.
Don DePalma (President and Chief Research Officer, Common Sense Advisory) is an industry analyst, author, and corporate strategist with expertise in business- and marketing-focused application of technology. He lectures, writes, and is frequently quoted on the topics of online marketing, content management, multicultural marketing, localization, return on investment, and website globalization. His book, "Business Without Borders: A Strategic Guide to Global Marketing," is widely used in universities and in business training courses.
Get IMUG event announcements, sign up for reminders and RSVP via Meetup, the world's largest network of local groups. IMUG events are usually $5, but free for members, volunteers and hosts.
Join the multilingual computing community! Admission to most IMUG events is free for members. Membership is only $20, annual renewals are $15, and lifetime membership is $100.