lien_Zed Posted February 19, 2009

Save a magazine today

For a number of weeks now I have been scanning in some old "Rainbow" magazines, produced by Falsoft in the USA for the Tandy Color Computer 1, 2 and 3. I have been producing full colour PDF files from these in an effort to save them. One of the reasons behind this is that the pages are turning from white to yellow, and I'm sure it won't be much longer before they are unreadable.

So a few weeks ago I was faced with the challenge of how to do this on my native operating system, Linux, in fact Linux Mint to be exact: http://www.linuxmint.com/

I went for a bit of a hunt and found a program called gscan2pdf: http://gscan2pdf.sourceforge.net/

gscan2pdf lets me easily scan with any scanner recognised by Linux and piece together many pages to produce a PDF book at the end. I get to do it all for free and get a few editing options along the way. The main one I use is page rotate 180 degrees.

With a little tweaking I have discovered that if I crank up the contrast and brightness when scanning, I can pretty much blow away all the background of the pages, eliminate all the yellowness, and not get any bleed-through of the print on the pages behind either. Effectively I end up with a full colour scan with a bright white background. This is great, as when I save the pages I want as much of the background gone as possible, for two reasons: firstly, too much background and the scan looks like crap; secondly, it makes the file size smaller.

I usually scan about 10 to 20 pages, then save my work and keep going till the book is completed. When saving files, I have found that I get the best possible PDF by saving with JPEG compression at 85% quality from a 300 dpi colour scan.
I will wind up with a directory that looks something like this:

    book.part1.pdf
    book.part2.pdf
    book.part3.pdf

These contain all the book's pages, but in 3 different files, and I want it as one monster PDF. This is easily achieved from the shell (command line) in Linux like this:

    pdftk book.part1.pdf book.part2.pdf book.part3.pdf cat output final.book.pdf verbose

This command tells the program pdftk to grab the 3 files book.part1.pdf, book.part2.pdf and book.part3.pdf, join them all together a page at a time, and produce one big PDF called final.book.pdf. The verbose option at the end tells the program to print as much info on the screen as it goes, so you can see what is going on. You don't need it, but it's nice to see what's happening.

Now, should you happen to put all these pages together into one big file and find that one page is upside down, like what happened to me, don't panic. Here is a very simple shell command to repair this; you just need to find the page number that is incorrect. For example, the other day I had a single page, page 21, upside down and needing to be rotated. This is easily done from the shell like this:

    pdftk final.book.pdf cat 1-20 21S 22-end output corrected.pdf verbose

This tells pdftk to get the file "final.book.pdf" and add each page from 1 through to 20, then get page 21 and rotate it 180 degrees, and then grab page 22 right through to the end of the file and add those one after the other with no change. It works a treat in a few seconds.

Hope this may encourage a few of you with old computer manuals, magazines etc. to scan them to PDF for future preservation. Naturally, go hunting through forums etc. first to see if they have already been done. Also be aware that for many magazines you need to get permission first to convert to PDF if you want to upload to the net. If you want to save them for your own records, then by all means do so.

Hope this may interest a few of you to get scanning and preserving those documents!
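For anyone who would rather not retype these each time, here is a minimal sketch of the two pdftk commands from this post wrapped up as shell functions. The function names join_parts and rotate_page_180 are made up for illustration, and the rotate helper assumes the upside-down page is neither the first nor the last page of the file. It uses the same S rotation as the command in the post; newer pdftk releases document this as south, so adjust it if your version complains.

```shell
# Join PDF parts into one file.
# Arguments: output file, then the part files in page order.
# (join_parts is a hypothetical helper name, not part of pdftk.)
join_parts() {
    out="$1"
    shift
    pdftk "$@" cat output "$out" verbose
}

# Rotate one page by 180 degrees and copy every other page unchanged.
# Arguments: input file, page number, output file.
# Assumption: the page is neither the first nor the last page.
rotate_page_180() {
    in="$1"
    page="$2"
    out="$3"
    before=$((page - 1))
    after=$((page + 1))
    pdftk "$in" cat "1-${before}" "${page}S" "${after}-end" output "$out" verbose
}
```

For example, `join_parts final.book.pdf book.part1.pdf book.part2.pdf book.part3.pdf` followed by `rotate_page_180 final.book.pdf 21 corrected.pdf` should run the same two commands as in the post.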
lilcuda Posted February 19, 2009

Thanks for the tips. I have a heap of Linux Format mags I would like to scan, but they just take forever to do.
lien_Zed Posted February 20, 2009

I'm in the process of scanning a box full of magazines. The box stands about 50 cm high and it's wide enough to fit two rows of books in it. I have books dating back to 1983 and most are complete years. I get a bit done each night.
headkaze Posted February 22, 2009

Dez, happy to host those mags over on RetroBytesPortal if you like. The only problem at the moment is that we have an upload limit, so in order to put large files into our download section we must add them to the SQL database manually.

Another good reason for doing this is for when digital paper is perfected and available at a reasonable cost. I really like the idea of having 100 mags on a micro SD card and being able to read them on a single sheet of digital paper. Let's face it, reading a PDF off a monitor is not exactly my idea of comfortable reading. This technology is going to revolutionise the way we read.
lien_Zed Posted February 22, 2009

Cool... Over the past month or two I have scanned in about 6 gigabytes worth of magazines and manuals for the CoCo. We already have our own FTP site, but when I'm through with the scanning I will shoot you an email and get the data to you for sure.

For all the magazines etc. that have been scanned, permission has already been granted to us to do this preservation work, so there's nothing to fear. I have joined up with a team in the USA and others round the globe who are scanning stuff for the Tandy Colour Computer 1, 2 and 3, hence the new icon in my sig as well ;)
lien_Zed Posted May 12, 2009

headkaze said: "Dez happy to host those mags over on RetroBytesPortal if you like. The only problem at the moment is we have a upload limit so in order to put large files into our download section we must add them to the SQL database manually."

Mate, we have *ALMOST* finished the archive. We have 11 magazines left to be scanned. We have every magazine from 1982 till it finished in 1993. I have added a lot of them myself at 300 dpi full colour. At the moment the collection takes up about 9 single layer DVDs. When done I will let you know.

In the meantime I have been busy scanning many Australian Tandy CoCo magazines and others, and putting a large dent in my collection. Thank god for my HP 6300C from dmworking from the Sydney dump, or I'd never get through them with my Canon LiDE 20 USB scanner.

Also, while talking about scanning: I have a number of books that are longer than an A4 page, and no A3 scanner. They are not A3 size though, so here is how to scan the pages and rejoin them. Take two scans of the page. Put, say, the first 2/3 of it in the first scan and save it, then move the book round so you get the remainder of the page and save again.

Now, to join the pages together you need a photo stitcher. There are 2 free ones I know of. The first is http://hugin.sourceforge.net/ which is easy to install from the Ubuntu shell. Type in

    sudo apt-get install hugin

and that's it. Next is autostitch: http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html

Let it join the pictures as a JPEG, then you can convert to PDF and put it into a mag in one swoop with pdftk.

Hope this helps anyone else who might be converting old mags to PDF.
lien_Zed Posted May 19, 2009

I have redone this and added instructions for converting to djvu.

The secret to high quality scanning
by Dez the Coconut in Australia

OK folks, there has been much talk about the scanning that everyone has done, and there are a number of scans of the Rainbow magazine. Some are OK quality, some good quality and some pretty amazing quality. Here's how I do my scanning, and hopefully someone might learn from my experiences thus far from the last 5 months.

First off, start with a copy of Ubuntu Linux from http://www.ubuntu.com. You can download it via torrent or FTP, or even request a free CD copy of it, or hunt around and see if there is a Linux user group where someone will gladly let you have a copy.

The choice is yours whether you install Ubuntu to an internal HDD, a USB memory stick or whatever. There are tons of ways to do it. Help is at http://www.ubuntuforums.org or you can email me. I'm no expert at it, but I have installed it to an external HDD, an external thumb drive, dual boot off one HDD, and dual HDD too. And for what we want to do, you could even argue that you don't need to install Ubuntu at all.

So once Ubuntu is installed (I'm using 9.04 currently on one machine and 8.04 on the other), you need to fire up the shell. The shell is the command line, just like OS9 or messy, oops, MS-DOS, only the shell is much more powerful.

With a bit of luck you can plug your USB scanner in, Ubuntu will recognise it, and we're in business. If it doesn't work and you need to install drivers... don't ask me. I have no idea. Ask at http://www.ubuntuforums.org

OK, now we need to install the scanner software that I use. It's called "gscan2pdf".
It is an open-source bit of gear and it does exactly what I need standing on its head. Its homepage is here: http://gscan2pdf.sourceforge.net/

To install it, from the shell type in

    sudo apt-get install gscan2pdf

While you're at it, install pdftk as well. It's the PDF slicer/dicer/boxer I use to manipulate all my PDF files:

    sudo apt-get install pdftk

We will also need pdf2djvu, so that once we have our high quality PDF we can convert it to djvu at 400 dpi, save mobs of space and still have very detailed documents. So again from the shell we type in

    sudo apt-get install pdf2djvu

OK, well, that's all the tools we need. Let's go scanning.

Fire up gscan2pdf and click the scan button. With a bit of luck your USB scanner will be auto-magically picked up, and you will see it along with some settings to change. The main scanner I'm using is an HP ScanJet 6300 with a 25 sheet ADF (automatic document feeder, like on a fax machine, for the ones who don't know what an ADF is).

I can select the speed I want to scan at. I always select the fastest.

Next is resolution. I always select 300.

Next is the mode of scan. We have:

line-art
half-tone
grey-scale
colour

Line-art is essentially a black and white scan with very little difference between black and grey. This is great to use on pages that are essentially just plain black text. DON'T USE THIS MODE IF PHOTOS ARE ON THE PAGE. They look horrid in this mode. This mode takes up only a little space.

Half-tone will take a very dark black original piece of paper and, when scanned, turn it into a very dull looking grey picture on your PC. Who knows why you would ever need this mode.

Grey-scale: use this mode if you have a black and white mag or newspaper and there are photos on the page. This mode gives you quite good looking b/w reproduction and looks magic. It takes up a fair bit of space, but not as much as full colour.

Full colour...
well, I think this is self-explanatory.

OK, so here's how I crank out a scanned magazine. Scan about 10 to 20 pages and then save. When saving, there are a number of options for your scanned pages. You can save an individual page or save the entire lot, and you have the choice as to whether you use JPEG or a few other formats as well. From my experiments, when saving to PDF I use JPEG as the compression method. JPEG is a "lossy format", so to combat the loss of quality I save at 84% quality. If I save at 85% quality the file size jumps to amazing proportions, so I leave it at 84%.

Repeat this scanning process for your book and you will wind up with your data saved in a directory, with file names that look like this:

    my.magazine.part1.pdf
    my.magazine.part2.pdf
    my.magazine.part3.pdf
    my.magazine.part4.pdf
    my.magazine.part5.pdf

For argument's sake we will say that each file has 20 pages in it and is 20 megabytes long, so when all joined together we will have a single PDF 100 MB long with all pages in numerical order.

To achieve this we fire up the shell again, go into the directory where we saved our files and fire up pdftk: http://www.accesspdf.com/pdftk/

pdftk lets us do all sorts of groovy stuff to PDF files. We are going to use it to join our individual files together and make one. It will do this standing on its head. It has numerous groovy features, but I won't go into those here.

OK, so from the shell we type in

    pdftk my.

And now we press the Tab key on our keyboard, and like magic we will have in front of our very eyes

    pdftk my.magazine.part

Told you the shell was powerful. It scanned the directory and magically added the "magazine.part" for us. Now we press 1, so we have

    pdftk my.magazine.part1

Now we press Tab again and we have in front of us

    pdftk my.magazine.part1.pdf

Smart, ain't it.
For very little time and effort we are getting our files in. So now we repeat the process of hitting Tab and pressing 2, 3, 4, 5 when needed, and in a matter of seconds we can type in this command line:

    pdftk my.magazine.part1.pdf my.magazine.part2.pdf my.magazine.part3.pdf my.magazine.part4.pdf my.magazine.part5.pdf

Let's see you type this in on a standard Windows command line and see how slowly you do it. LOL!

Next we need to tell pdftk that we are going to join all the files together into one giant file, so we add this:

    cat output my.magazine.pdf verbose

This is the bit we add to the end of what we typed, so our entire command line will read like this:

    pdftk my.magazine.part1.pdf my.magazine.part2.pdf my.magazine.part3.pdf my.magazine.part4.pdf my.magazine.part5.pdf cat output my.magazine.pdf verbose

The verbose option at the end tells pdftk to echo to the screen what it is doing. This saves you guessing what the heck is going on, 'cos if you leave it off you get no feedback from the program when you press enter.

So press enter and watch the pages fly by. In a matter of seconds you will be back at your command prompt with a flashing cursor. Now look at the directory and you will see the final bit of gear, called my.magazine.pdf. Open it up with the document viewer, scroll down and admire the 100 page document you have pieced together.

Now... have a lookie at the file size. I bet it will be around 110 megabytes or perhaps a little more.

So, now to convert this to djvu format, keeping the high quality pages but making the file smaller. Fire up the shell and type in

    pdf2djvu -o my.magazine.djvu -d400 -v my.magazine.pdf

To explain this a little:

-o my.magazine.djvu tells the program that the output file will be called my.magazine.djvu
-d400 tells it we want it to compress using 400 dpi
-v tells it we want to see some output on the screen so we know what the heck is going on
my.magazine.pdf is the name of our original file

Now press enter.
You will see something like this:

    aussie.coco.june.1988.pdf:
    - page #1 -> #1:
    - image size: 3199x4332
    - 353010 bytes out
    - page #2 -> #2:
    image size: 3199x4332
    (I have deleted many pages out of this bit)
    - 341857 bytes out
    - page #76 -> #76:
    - image size: 3167x4332
    - 450144 bytes out
    0.210 bits/pixel; 3.858:1, 74.08% saved, 105702515 bytes in, 27394816 bytes out

So you get the idea. Now look in the directory and you will see your .djvu file, your original PDF pieces and your final PDF. Ditch the .part1.pdf style files and keep the final PDF and djvu files.

Don't destroy the PDF files. PDF originals are easier to work with than djvu files, so do any editing on the PDF and then remake the djvu file. Piece of cake.

Hope this might help someone else who wants to do their own scanning.
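The join-then-convert steps from this post can be sketched as one shell function, using only the pdftk and pdf2djvu options shown in the post. The names parts_to_djvu and djvu_savings are hypothetical helpers made up for illustration. The parts are passed explicitly in page order rather than with a wildcard, on the assumption that a book may run past nine parts, where part10 would sort ahead of part2. The second helper just redoes the arithmetic behind the summary line pdf2djvu prints.

```shell
# Join the scanned parts into NAME.pdf, then convert that to NAME.djvu.
# Arguments: base name (e.g. my.magazine), then the part files in page order.
# (parts_to_djvu is a hypothetical helper name.)
parts_to_djvu() {
    name="$1"
    shift
    # Same pdftk and pdf2djvu invocations as in the post; stop if the join fails.
    pdftk "$@" cat output "${name}.pdf" verbose &&
        pdf2djvu -o "${name}.djvu" -d400 -v "${name}.pdf"
}

# Recompute the compression ratio and the percentage saved.
# Arguments: bytes in (PDF side), bytes out (djvu side).
djvu_savings() {
    awk -v in_bytes="$1" -v out_bytes="$2" 'BEGIN {
        printf "%.3f:1, %.2f%% saved\n", in_bytes / out_bytes, (1 - out_bytes / in_bytes) * 100
    }'
}
```

Running `djvu_savings 105702515 27394816` prints `3.858:1, 74.08% saved`, which matches the last line of the output quoted in the post, so roughly 105 MB of PDF came down to about 27 MB of djvu.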
Chargin Posted May 20, 2009

I read most of this thread but not all, so I might have missed you saying this already. A big tip for scanning any mag is to put a black piece of paper behind the page you are scanning. This way there's zero "bleed through" of the text or images from the other side of the page being scanned. Upping the contrast etc. will help give a good clear white page when scanned. I'm sure you guys already know this though.
lien_Zed Posted May 20, 2009

Chargin said: "I read most of this thread but not all so I might have missed you say this already. Big tip for scanning any mag is put a black piece of paper behind the page you are scanning. This way theres zero "bleed through" of the text or images from the other side of the page being scanned. Upping the contrast etc will help give a good clear white page when scanned. Im sure you guys already know this though."

No need for the extra paper. Crank up the brightness and contrast and it will be bright enough to blow away the bleed-through from the pages behind. By default my scanner brightness and contrast are set to 0 in the software. I set them both to 30 and I get excellent results every time. When you scan in colour, this also helps get the yellowness out of pages that have colour photos on them.

I have scanned many thousands of pages like this since January and it works a charm. Looking at the pile in front of me, I reckon I will still be scanning come January 2010!!!