The International Journal 
of Newspaper Technology

Nov. 2006

Post-Dispatch automates agate with home-grown scripting tool

By Tara McMeekin
Editor

 

The St. Louis Post-Dispatch is cutting the time formerly used to manage agate copy by rolling out a home-grown app that automatically formats the tiny typeface, regardless of the source.

Using a script developed by Newsroom Technology Director John Hurst and News Administration Director Charles Arms, the newspaper (daily, 277,842; Sunday, 423,291) can process agate in seconds, all with the push of a button.

Hurst said the paper is deploying the tool one section at a time, in conjunction with its implementation of MediaSpan’s Jazbox editorial app, which uses Adobe InCopy for word processing and InDesign for page design.

 

The script has already been employed for its sports agate and other features, and Hurst said it will be used in the business and news sections as well.



Agate copy as it appears before and after running through the functions of the Post-Dispatch's home-grown scripting app.
Graphics: St. Louis Post-Dispatch

 

Mixture of VB

The script, a mixture of Visual Basic and the InCopy/InDesign document object model, runs under InCopy, Hurst said.

“We take a sports agate file or a recipe or a movie listing, for instance, from any source and the script sets the file up for use on an InDesign page. It adds all the attributes - rule lines, style, extra leading, tabs - and stylizes the file to get it ready for the paper.”

What makes the approach unique is that the script analyzes each line of agate individually to determine what rules to apply, rather than forcing someone to hand-code or use macros, Hurst said.

The script analyzes the copy one line at a time, beginning with the first, where it finds a keyword or code that determines what to do with the file.

“If the keyword is ‘NFL longbox,’ then we run the appropriate modules in the script to stylize and code up that box,” Hurst said.
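The first-line lookup Hurst describes might be sketched roughly as follows. This is a Python illustration, not the Post-Dispatch's Visual Basic code; apart from the "NFL longbox" keyword quoted above, the handler names and the fallback behavior are assumptions made for the example.

```python
# Hypothetical handlers: the article does not describe the real modules.
def style_nfl_longbox(lines):
    """Tag every line as belonging to an NFL longbox (placeholder logic)."""
    return [("nfl_longbox", line) for line in lines]


def style_plain(lines):
    """Fallback for files whose keyword is not recognized (an assumption)."""
    return [("plain", line) for line in lines]


# Keyword on the first line -> module that styles the rest of the file.
HANDLERS = {
    "nfl longbox": style_nfl_longbox,
}


def dispatch(text):
    """Read the keyword from the first line and run the matching handler."""
    lines = text.splitlines()
    if not lines:
        return []
    keyword = lines[0].strip().lower()
    handler = HANDLERS.get(keyword, style_plain)
    return handler(lines[1:])
```

Keeping the keywords in a table means a new file type is added by registering one more entry rather than by editing the dispatch logic.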

Once the script determines what type of file it’s dealing with, it examines the second line, checking it against an array - an ordered arrangement of data elements - that the programmers created to hold the breakdown of the file, including all the unique lines that make up the file.

“If you were to take a file and look at each line separately, you would say ‘OK, the pattern of these five lines matches the pattern of these three lines, matches the pattern of this one line, etc.,’” Hurst explained.

The Post-Dispatch’s scripting app relies heavily on the use of regular expressions, which work by pattern matching.

 

Matching patterns

“Instead of looking to see if a line contains the word football, we’ve written it so that it maybe looks for a capital letter F followed by lowercase letters and ending in a lowercase l,” Hurst said. “We’ve set the array up in such a way that if something matches a certain pattern, there are things we know need to be done to that pattern.”

Once a pattern is matched, the script performs the steps the array defines for it and moves on to the next line, until it has worked through every line of copy.
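A rule array of this kind could look something like the Python sketch below. The first pattern is the one Hurst describes (a capital F, lowercase letters, a final lowercase l); the second pattern and all of the labels are invented for illustration and are not taken from the paper's script.

```python
import re

# Each rule pairs a regular expression with a label standing in for
# "the things we know need to be done" to a line matching that pattern.
RULES = [
    # Capital F, any lowercase letters, ending in lowercase l ("Football").
    (re.compile(r"^F[a-z]*l$"), "section_head"),
    # Assumed shape of a score line: a name followed by tab-separated numbers.
    (re.compile(r"^[A-Z][A-Za-z. ]+\t\d+(\t\d+)*$"), "score_line"),
]


def classify(line):
    """Return the label of the first rule whose pattern matches the line."""
    for pattern, label in RULES:
        if pattern.match(line):
            return label
    return "body"  # default when no pattern matches (an assumption)


def process(lines):
    """Walk the copy one line at a time, labeling each line in turn."""
    return [(classify(line), line) for line in lines]
```

Because the first pattern describes a shape rather than a literal word, it also matches other words of the same shape, such as "Final"; patterns have to be written tightly enough to avoid unwanted matches.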

Arms said tabs were among the Post-Dispatch’s most trying style issues for agate copy, and an area where the scripting tool has been key.

“A tab has a lot of attributes to it - right or left, leader lines, where its position stops in a column - so if you want to assign tab attributes to a style you might literally end up with 500 styles,” he explained. “We have to have a style for every set of stock values and there might be 10 lines where the typography is exactly the same, but if the tabs were in different places that would require different styles.”

That proved a major obstacle for the paper. Because even a simple box score might contain five sets of tabs, the number of different styles governing those tabs would be overwhelming, Arms said.

“You can imagine how many styles we would have just for agate,” he said. “Now we have one agate style and by scripting we can set tab-stop values and other attributes of the tabbing. It works very well and keeps our style palette clean.”
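One way to picture the approach Arms describes is to keep a single style name and store the tab settings as data keyed by line type. The Python sketch below assumes that layout; the line types, positions and leader characters are made-up values, and it does not call the InDesign scripting interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TabStop:
    position_pt: float  # distance from the left edge of the column, in points
    alignment: str      # "left" or "right"
    leader: str = ""    # leader character, e.g. "." for dot leaders


# Hypothetical tab sets: one entry per kind of line, not one style per kind.
TAB_SETS = {
    "batting_line": [TabStop(72, "right"), TabStop(84, "right"),
                     TabStop(96, "right"), TabStop(108, "right")],
    "standings_line": [TabStop(60, "right", "."), TabStop(75, "right")],
}

AGATE_STYLE = "agate"  # the single shared style; the name is assumed


def format_line(line_type, text):
    """Attach the one agate style plus the tab stops for this kind of line."""
    return {
        "style": AGATE_STYLE,
        "tabs": TAB_SETS.get(line_type, []),
        "text": text,
    }
```

Under this arrangement, ten lines with identical typography but different tab positions share one style and differ only in the tab data attached to them.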

 

Abandoning macros

The scripting tool also let the Post-Dispatch abandon the use of macros formerly needed to manage agate.

“That entails finding a word and then jumping down so many lines, and it’s very mechanically driven,” Arms said. “That concept was around 20 years ago and it’s basically just memorizing keystrokes.”

Unfortunately, Hurst said, that is the method most papers are dealing with.

“The model we’re applying here is - as far as I know - not done anywhere else,” he said. “Most papers are hand-coding copy or using macros, which consist of recording keystrokes. So you’re saying ‘go down five lines, find this word and put a colon on the end of it’ - and if that word’s not there, it can’t do what it tried to do.”

Identifying lines by regular expression is another benefit.

“The beauty of identifying the lines by regular expression is that, for example, with something simple like a major league baseball box score, on any given night the nine starters may play the whole game and then on another night they may have five different guys pinch hit, so you never know how many lines are going to be in that box score,” Arms said.

“You can’t use automated cursor movements, you literally have to know when something starts and something else stops. By having a regular expression pattern that represents each line, we just move along and we know exactly when it’s changed regardless of how many lines are in the agate file.”
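The point about variable-length box scores can be illustrated with a short Python sketch that groups consecutive lines by the pattern they match, so a section ends wherever the pattern changes rather than after a fixed number of lines. The line formats and type names here are assumptions for the example.

```python
import re
from itertools import groupby

# Assumed line shapes: a totals line, and a batter line with four stat columns.
LINE_TYPES = [
    ("totals", re.compile(r"^Totals\t")),
    ("batter", re.compile(r"^[A-Z][^\t]*(\t\d+){4}$")),
]


def line_type(line):
    """Name the first pattern that matches, or 'other' if none does."""
    for name, pattern in LINE_TYPES:
        if pattern.match(line):
            return name
    return "other"


def segment(lines):
    """Split the copy into runs of consecutive lines of the same type."""
    return [(kind, list(group)) for kind, group in groupby(lines, key=line_type)]
```

A box with nine batters and one with eleven both come back as a "batter" run followed by a "totals" run; only the length of the first run differs.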

The scripting tool has allowed the Post-Dispatch to automate several styles unique to its paper: automatically converting time references in AP wire stories from Eastern to Central; formatting celebrity birthdays, which the Post-Dispatch handles differently than many other papers that simply run the AP copy; identifying winners and losers of pitching match-ups at the bottom of baseball box scores; and creating unique overlines in NFL box scores.
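As an illustration of the time-zone step, the Python sketch below finds times written like "8:15 p.m. EST" and moves them back one hour to Central. The exact way times appear in the wire copy, and how the paper words the result, are assumptions here; the sketch also leaves "12 p.m." as is rather than writing "noon" and does not adjust the date when a time crosses midnight.

```python
import re

# Assumed format: "8 p.m. EST", "8:15 p.m. EDT", "12:30 a.m. EST".
TIME_RE = re.compile(r"\b(\d{1,2})(?::(\d{2}))? (a|p)\.m\. E([SD])T\b")


def eastern_to_central(text):
    """Rewrite Eastern clock times in the text as Central (one hour earlier)."""
    def shift(match):
        hour = int(match.group(1))
        minutes = match.group(2)      # None when the time has no minutes
        meridiem = match.group(3)     # "a" or "p"
        daylight = match.group(4)     # "S" (standard) or "D" (daylight)
        # Convert to a 24-hour clock, subtract one hour, convert back.
        hour24 = hour % 12 + (12 if meridiem == "p" else 0)
        hour24 = (hour24 - 1) % 24
        new_meridiem = "p" if hour24 >= 12 else "a"
        new_hour = hour24 % 12 or 12
        clock = f"{new_hour}:{minutes}" if minutes else str(new_hour)
        return f"{clock} {new_meridiem}.m. C{daylight}T"

    return TIME_RE.sub(shift, text)
```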

The daily wrote in a special function to create the overline in NFL boxes using team nicknames, based on the city name provided by AP and the lines identified as the score.

“When we send ‘St. Louis’ over to functions it brings back ‘Rams’ and we can automatically parse from that data the winner, the loser, supply the nickname and write out that line,” Arms said.
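A rough Python sketch of the lookup Arms describes follows. The nickname table is partial, the score-line format and the wording of the overline are assumptions, and the scores in the usage note are invented sample data.

```python
import re

# Partial, illustrative city-to-nickname table.
NICKNAMES = {"St. Louis": "Rams", "Seattle": "Seahawks"}

# Assumed score line: city, tab-separated quarter scores, then the final total.
SCORE_LINE = re.compile(r"^(?P<city>[A-Z][A-Za-z. ]+?)\t(?:\d+\t)+(?P<total>\d+)$")


def overline(score_lines):
    """Build a 'Winner NN, Loser NN' overline from the two score lines."""
    teams = []
    for line in score_lines:
        match = SCORE_LINE.match(line)
        if match:
            city = match.group("city")
            # Fall back to the city name when no nickname is on file.
            teams.append((NICKNAMES.get(city, city), int(match.group("total"))))
    if len(teams) != 2:
        return None  # not a two-team box; leave it for a person to handle
    # Higher total first; a tie keeps the original order (stable sort).
    winner, loser = sorted(teams, key=lambda team: team[1], reverse=True)
    return f"{winner[0]} {winner[1]}, {loser[0]} {loser[1]}"
```

Given the invented lines "Seattle\t7\t3\t7\t7\t24" and "St. Louis\t0\t14\t7\t7\t28", the function returns "Rams 28, Seahawks 24".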

These features previously required an agate copy clerk to enter the information by hand and took several minutes. Now they take between 5 and 20 seconds, Hurst said.

“Having the one-button functionality is huge, but then the bigger benefit is that the (copy) is so close to being publication-ready when this thing runs,” Arms said.

The process is nearly instantaneous, and the paper boasts a 99 percent success rate with the app. Typical corrections include shortening names that are too long for a particular line in score boxes, Arms said.

When errors are identified, they are immediately corrected so that the same one never occurs twice.

“We almost amaze ourselves in that we’ve had so few errors,” Arms said.

Although Visual Basic anchors the script, for which Hurst and Arms have written more than 15,000 lines of code, Hurst said the app could just as easily have been written in JavaScript.

“It’s just a script that goes underneath one of the InCopy folders and once you put it in the folder, InCopy sees it and then you can assign it to a function key or run it from the Script palette in InCopy,” he explained.

 

Showing off script

Hurst and Arms will show their technology to other newspapers at the next MediaSpan User’s Group, scheduled for February 2007.

They both agree that the tool could be implemented even more widely at the paper, since the Visual Basic script can manipulate any copy that contains recognizable patterns.

“What makes this unique is the way our model works,” Hurst said. “Because of the pattern matching we can apply this any place that we want to automate something.”