
Meiroku zasshi (明六雑誌) now available online

The Meiji periodical founded and written by Fukuzawa Yukichi and others, Meiroku zasshi 明六雑誌, has now been put online in full text – or rather, page images. They’re available in both JPG and PDF format. This is a great resource for Meiji researchers, as it’s not exactly easy to get ahold of this 1874-1875 periodical otherwise. And let me tell you, these are high quality color images, highly readable, and you can even get a sense of the texture of the page. It’s a beautiful digitization and a valuable project.

You can access it at the 明六雑誌画像 website.

New issue of D-Lib magazine

D-Lib magazine has just published their most recent issue, available at

This looks to be a great issue, with a number of fascinating articles on dissertations and theses in institutional repositories, using Wikipedia to increase awareness of digital collections, MOOCs, and automatic ordering of items based on reading lists.

Please check it out! All articles are available in full-text on the site.

NDL makes public the Historical Recordings Collection digital archive

On March 15, 2013, the National Diet Library made public their new digital archive of historical recordings. In partnership with a number of groups, including NHK, they have digitized and made available recordings from SPs from 1900 to the 1950s, in order to preserve them and prevent their becoming lost.

As time goes on, they plan to hold approximately 50,000 recordings in the archive. Although many recordings can be accessed via the Internet, some are only available to listen at the NDL itself due to copyright restrictions.

You can also access an NDL article on the digitization of recordings, entitled 音の歴史を残す (PDF link).

The archive is the Historical Recordings Collection, accessible at

instagram, photoshop, and publicity rights

There has been a bit of a furor over Instagram’s new terms of service, in which I unwittingly took part – well, perhaps half unwittingly. I jumped on the bandwagon of outraged Instagram users and posted directions on my Twitter for backing up your photos and deleting your account. Then I got the news (also via Twitter) that they’re backtracking on the offending language, which would have let them give your photos, profile information, geolocation information, and other metadata to advertisers (‘third parties’) for their use, without compensation, presumably in advertising (‘enhanced advertising’ if you will). I seriously considered deleting my account, despite my abject love of the service. As a semi-professional photographer, I’ve found it amazing for getting my photos online quickly, taking more shots than I would otherwise, and self-promotion. I’d be very sad to have to leave.

Yet some of the furor has been over people worrying that their kids’ photos would be used without their knowledge or compensation, even if they were private photos. I’d like to take this chance to remind people of publicity rights: the right not to have one’s likeness used, to promote products or otherwise, without one’s permission. This applies to everyone, not just celebrities. So the use of kids’ photos without permission is flat-out illegal and Instagram could be sued for doing so; given this, it’s extraordinarily unlikely that this would ever happen. People worried about kids’ or friends’ or family’s photos have nothing to worry about.

Still, there is some pushback on the part of media companies who want to use your photos as they see fit. (Note also that we all need to be reminded that we still hold our copyrights – what we’re granting is a non-exclusive license, not a copyright transfer, so people need to not be flipping out about this either. You still own your stuff.) Quoted from an article I came across:

Right of publicity laws protect people, both celebrities and everyday citizens, from having their names or photos used for commercial purposes. However, using a person’s name or photo for news reports is not a violation of these laws, according to the Digital Journalist’s Legal Guide, which was produced by the Reporters Committee for Freedom of the Press.

In fact, Facebook defended its “sponsored stories” as “newsworthy” in the California lawsuit, saying that people’s brand preferences should be considered “news” to their Facebook friends.

The fact that Facebook is arguing that this is “news” is interesting and disturbing. I really hope they lose this lawsuit, because otherwise this would be a massive blow to publicity rights, and thus people’s control over their own likenesses. This is an important right in terms of privacy, one that predates the digital world, and is crucial to people’s sense of self-determination. I am going to be following this story closely, although it turns out that Facebook wants to settle a class-action lawsuit that would give only $10 to each offended individual. That is, in a word, wack.

But for the meantime, worry about services using non-likeness photos, because hopefully Facebook will lose and we will only be left with the serious issue of terms of service dictating non-exclusive licenses of copyrighted material.

What I’d really like to see is a lawsuit involving that, to see if terms of service are actually binding contracts, but I haven’t heard of any court cases of this nature so far. I’d like to hear from my readers who are more knowledgeable than I am in this area, and who may have heard of court cases pending that might answer this question for me.

how to confuse amazon

Search for The Culture of Collected Editions and, while the first result is the right book, check out the rest of your odd results! Here are a featured few. Click on the image for full effect.

Also, why is there no choice between 300px wide and 1024px wide for displaying this image with WordPress? All I want is 500px so you can read that “Command and Conquer” is on the list.

digital surrogates and utility

As someone who studies the history of the book, often as an object in itself, my research tends to require that I go look at books in person. However, I use the Kindai Digital Library quite regularly as a way to survey what exists (although I fully realize how incomplete Kindai is), and indeed, I would never have found my research topic without being able to preview books using this digital library.

The point is, I previewed the books using Kindai, and then got on a plane to Japan to actually study the books for my research. I had to locate a physical copy and literally get my hands on it, in order to understand how it was made, what impression it would make on readers, and its intended audience. (For example, how well-made is it? Does it have color illustrations or text? What’s the quality of the paper like? Does it feel or look cheap? How is the binding? None of these questions can be answered from the black-and-white copy in Kindai.)

The history of the Kindai Digital Library is interesting: it’s a digitization project undertaken by the National Diet Library and based in the same collection as the Maruzen Meiji Microfilm: books microfilmed and owned by the NDL. Neither covers the entire collection of Meiji books that the NDL owns; it’s not clear (to me, anyway) whether Kindai and Maruzen are coextensive; and the NDL’s collection does not contain every book published in the Meiji period. So, yes, it has limitations – it’s not every book from the Meiji period, and it’s scanned microfilm in black-and-white, not grayscale.

But the Kindai Digital Library, unlike the Maruzen microfilm collection, is being added to continuously, and out-of-copyright books from the Taisho and Showa periods (1912-1989) are also being scanned and included in the collection. For the newer books, they themselves are being digitized, rather than having microfilm as an intermediate step. Check out the difference between these two books by Wakamatsu Shizuko, published in 1897 (color) and 1894 (black and white):

Sure, there is a big impressionistic difference in seeing a full-color cover illustration versus a black-and-white scan of what used to be a color cover. But you can see from these images that it’s very difficult to tell the quality and condition of the monochrome image, versus the higher-quality color image that captures things like discolorations on paper and the quality of the cloth binding (not pictured here).

This makes all the difference for someone doing my kind of research: if I had scanned copies of the anthologies I study that are as good as the color book above, it’s likely that I could still do decent research – if incomplete – without going to Japan to look at these books in person. With the higher-quality color image, the digital surrogate has become a usable surrogate for me, a reasonable facsimile if you will. It provides me with enough information to be able to draw conclusions about more than just the content of the book.

This matters for more than book historians, however. One reason that Kindai Digital Library is so great is that it provides digital surrogates of the full text of books, not just their covers. Every page that is available is scanned, either from microfilm or from the book itself, and provided for viewing online – and, if you have the patience, as a PDF download a few pages at a time. Yet compare these images, again from the 1897 and 1894 books introduced above. Click to view the full size so you can see the quality of the text in each. They are both at 25% zoom in Kindai’s page viewer.


Here, you can appreciate the difficulty of reading the monochrome text – and this is an exceptionally clear one. The books I have read (with difficulty) excerpts from on Kindai are typically much lower quality and many characters are difficult to make out. Zooming in doesn’t help, because the quality of the image itself is relatively low.

On the other hand, you have the newer additions with higher-quality surrogates such as this color book. Of course, it’s not necessary to have color pages to read a text that was originally printed in black and white, but the inclusion of values other than straight black or white increases readability by allowing for a higher quality image. It also allows for clearer text when zooming out, viewing at say, 33% (a percentage where the monochrome text would look terrible).
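To make the point about in-between values concrete, here is a minimal Python sketch. It is a toy illustration of my own, not anything Kindai actually runs: the tiny `page` grid, the threshold of 128, and the function names are all assumptions. It stores the same "scan" once as grayscale and once as pure black-and-white, then shrinks both by averaging 2x2 blocks, which is roughly what a viewer does when you zoom out.

```python
# A made-up 4x4 "scan": white paper (255), a faint stroke (150) in
# column 1, and a strong stroke (40) in column 3.
page = [[255, 150, 255, 40] for _ in range(4)]

def to_bitonal(pixels, threshold=128):
    """Force every pixel to pure black (0) or pure white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in pixels]

def downscale_half(pixels):
    """Shrink by half in each direction by averaging each 2x2 block."""
    out = []
    for y in range(0, len(pixels) - 1, 2):
        row = []
        for x in range(0, len(pixels[0]) - 1, 2):
            block = (pixels[y][x] + pixels[y][x + 1]
                     + pixels[y + 1][x] + pixels[y + 1][x + 1])
            row.append(block // 4)
        out.append(row)
    return out

gray_small = downscale_half(page)
bitonal_small = downscale_half(to_bitonal(page))

print(gray_small)     # [[202, 147], [202, 147]]
print(bitonal_small)  # [[255, 127], [255, 127]]
```

In the grayscale version the faint stroke survives as a light gray (202) next to the darker one. In the black-and-white version it was thresholded to plain white before any zooming happened, so it is gone for good, and no amount of zooming in will bring it back.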

As you can see, the point is that the newer Kindai texts are more usable than the older ones, not just prettier. They express the idea that there is a point where a digital surrogate becomes a usable surrogate, where it becomes “good enough” to live up to its name. Of course, “usable” depends on the purpose, but I think we can agree that if “reading” is the purpose, these new scans are far closer to the goal than the old ones.

Kindai should be commended for this commitment to higher quality in new additions to the library; I only wish there were the resources to re-digitize everything in the library at this standard.

Why is this important? It’s not just because it would be an even more convenient resource for myself and my colleagues, an even more usable one. It’s because of the very real danger of losing some of these books. There are few, if any, copies of many of them left outside of the NDL’s collection, and many of them can no longer be viewed at the NDL in any format other than microfilm. It’s not clear to me whether the originals are being protected from the public, or if NDL actually only owns the microfilm, with the original lost to time at some point. Regardless, for many books, the Kindai scan (or NDL microfilm, its source) is the only copy of the book available. If it’s not even fully readable – the most basic level of utility beyond knowing from search results that it exists – then we have failed in our task of preservation, and in our task of creating a digital surrogate in the first place. A surrogate can’t take the place of the original if it can’t mimic it in the most basic ways. Given the fragility of Meiji and Taisho (and early Showa) sources, it’s crucial that we make available the highest-quality digital surrogates we can, and as soon as possible, before we no longer can.

*The first few editions of The Complete Works of Higuchi Ichiyo, which feature prominently in my dissertation, are a case of this. I never found a physical copy of the very first edition, actually, even outside of NDL.

digital resource: JAIRO

Today I’d like to introduce a digital resource that I’ve found phenomenally helpful in the past: Japan Institutional Repositories Online, or JAIRO.

This is exactly what it sounds like: a federated search for Japanese institutional repositories (IRs), with (of course) downloadable PDF full text of all the works that are in the database.* What’s amazing (to me) about JAIRO is that, unlike my stereotype of IR, it contains not only academic papers but theses and dissertations (which are also included in University of Michigan’s Deep Blue and many other American IRs), entire books, pieces of software, datasets, presentations, conference papers, and various types of bulletin and technical papers. Check it out:

The number of institutions involved in JAIRO is similarly mind-blowing. There’s no total listed on the page, but it’s well over a hundred, including universities ranging from Okinawa Christian Junior College to Waseda University. JAIRO also provides a separate full list of all IRs in Japan, 200 long, with links to each.

The content tends toward the scientific, but I’ve certainly found a large number of humanities resources. It’s great to have so many “departmental bulletin papers,” as they’re called, because the length and content of these is comparable to a “normal” journal article and they’re both current research and much, much easier to get in digital form. I’ve used several in my research already and have found them to be, hands down, the most valuable sources on the topics they cover.**

JAIRO has both a simple and advanced search, and it’s quite easy to use and browse through. Because it’s a site run by the National Institute of Informatics (NII) it also has some analysis of data about its own contents; additionally, that analysis is used to provide links to popular and new materials on the front page.

In comparison to the IRs I’ve used in the past, JAIRO’s interface is a miracle of both utility and usability (again, leave it to NII to create something this good): it’s powerful, easy to use, and quickly delivers you the content that you want. And it adds significant value by including even items as small as a list of frequently downloaded material or their (admittedly small) list of papers related to the Nobel Prize in Chemistry.

JAIRO is a project that falls under the umbrella of the NII Institutional Repositories Program, which also includes the fascinating NII Institutional Repositories Database Contents Analysis with detailed statistics, graphs, and downloadable TSV files of data on IRs in Japan. JAIRO is also a search target of PORTA, the National Diet Library (NDL)’s digital archive search portal, which I’ve written about previously.

So my question to my readers is this: Is there anything like this resource for American or other English-language IRs? Anything like the PORTA digital archive federated search and portal? These are amazing resources and I only wish that I could search American universities’ IRs in the same powerful way.

* A caveat: I have no idea if it’s searching these multiple databases in real time or if it’s indexed and cached everything for search. (Reader question: does it still count as federated search if it’s not real-time?) Regardless, JAIRO retrieves results that would otherwise have to be accessed from over a hundred separate databases on their own individual sites.

** Two that come to mind are on the Meiji revival of Ihara Saikaku, and the posthumous reception of Kitamura Tōkoku.
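On the question in the first footnote, here is a small Python sketch of the two approaches. To be clear, I don't know which one JAIRO uses, and everything here is hypothetical: the `REPOSITORIES` dictionary, the record titles, and the function names are made up for illustration. One function asks every repository at query time; the other searches a local copy that was harvested earlier.

```python
# Hypothetical stand-ins for repositories: each maps a record title
# to its description. None of this is real JAIRO data.
REPOSITORIES = {
    "repo_a": {"paper 1": "Meiji reception of a classic author"},
    "repo_b": {"paper 2": "posthumous reception of a poet"},
}

def search_live(repos, term):
    """Real-time approach: ask every repository at query time."""
    hits = []
    for name, records in repos.items():
        for title, text in records.items():
            if term.lower() in text.lower():
                hits.append((name, title))
    return sorted(hits)

def harvest(repos):
    """Harvesting approach: copy every record into one local index."""
    return [(name, title, text)
            for name, records in repos.items()
            for title, text in records.items()]

def search_index(index, term):
    """Search only the local copy; the repositories are not contacted."""
    return sorted((name, title) for name, title, text in index
                  if term.lower() in text.lower())

index = harvest(REPOSITORIES)

# A record added after the harvest shows up live but not in the index.
REPOSITORIES["repo_a"]["paper 3"] = "reception of collected editions"

print(search_live(REPOSITORIES, "reception"))
print(search_index(index, "reception"))
```

The live search finds all three records, while the harvested index only knows about the two that existed when it was built. The trade-off is the usual one: the index answers quickly and keeps working when a repository is down, but it is only as fresh as the last harvest.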

programming practice problems

One of the hardest things for me about learning a new programming language is not getting an understanding of the syntax or overarching concepts (like object-oriented programming or recursion), but rather a lack of opportunity for practice. It’s one thing to read a few books about Python, and quite another to look at others’ nontrivial code, or write nontrivial code yourself.

However, I’m often at a loss for ideas when I try to come up with programming projects for myself. Call me uninspired, but I just don’t have many needs for writing programs in my daily life, especially complex ones. And I don’t have any big creative ideas, either. I don’t even have uncreative ideas. So what to do?

It turns out there are a few good resources online for practice programming problems. They’re language-agnostic, presenting a problem and asking you for its solution. Unfortunately, there are only a few resources for this, but I thought I’d share the ones I found.

The first is the Association for Computing Machinery’s International Collegiate Programming Contest. This provides the contest problems from 1974 to the present! Talk about a treasure trove of programming challenges.

Second, UVa Online Judge. This site contains hundreds of programming problems, some simple and some complex. They have volume upon volume of problem sets. You could spend the rest of your life doing the problems on this site.

Does anyone have additional resources to add?

free software day in Cambridge 9/15

Hi everyone,

It’s Free Software Day in Cambridge, MA, this Saturday (9/15) and there is a day-long event happening in celebration, and to bring the community together. If you’re interested in attending, it’s located at Cambridge College (1000 Mass Ave) and starts at 10 AM.