Friday 29 May 2015

The UK Web Archive, Born Digital Sources and Rethinking the Future of Research

The following post is derived from a short talk I gave at a doctoral training event at the British Library in May 2015, focused on using the UK Web Archive. It was written with PhD students in mind, but really forms a meditation on the opportunities created when we are working with web sites rather than print. While lightly edited, the text retains the tics and repetitions of public presentation.



My office c.1984
I normally work on properly dead people of the sort that do not really appear in the UK Web Archive – most of them eighteenth-century beggars and criminals. And in many respects the object of study for people like me – interlocutors of the long dead – has not changed that much in the last twenty years. For most of us, the ‘object of study’ remains text. Of course the ‘digital’ and the online have changed the nature of that text. How we find things – the conundrums of search – shapes the questions we ask. And a series of new conundrums has been added to all the old ones – do, for instance, ‘big data’ and new forms of visualisation imply a new ‘open eyed’ interrogation of data? Are we being subtly encouraged to abandon older social science ‘models’ for something new? And if we are, should these new approaches take the form of ‘scientific’ interrogation, looking for ‘natural’ patterns – following the lead of the Culturomics movement; or perhaps take the form of a re-engagement with the longue durée – in answer to the pleas of the History Manifesto? Or perhaps we should be seeking a return to ‘close reading’ combined with a radical contextualisation – looking at the individual word, person, and thing in its wider context, preserving focus across the spectrum.

And of course, the online and the digital also raise issues about history writing as a genre and form of publication. Open access, linked data, open data, the 'crisis' of the monograph, and the opportunities of multi-modal forms of publication, all challenge us to think again about the kind of writing we do, as a literary form. Why not do your PhD as a graphic novel? Why not insist on publishing the research data with your literary overlay? Why not do something different? Why not self-publish?

These are conundrums all – but conundrums largely of the ‘textual humanities’.  

Ironically, all these conundrums have not had much effect on the academy and the kind of scholarship the academy values.  The world of academic writing is largely, and boringly, the same as it was thirty years ago.  How we do it has changed, but what it looks like feels very familiar.

But the born digital is different. Arguably, the sorts of things I do, history writing focused on the properly dead, look ‘conservative’ because they necessarily engage with the categories of knowing that dominated the nineteenth and twentieth centuries – these were centuries of text, organised into libraries of books, and commented on by cadres of increasingly professional historians. The born digital – and most importantly the UK Web Archive – is just different. It sings to a different tune, and demands different questions – and if anywhere is going to change practice, it should be here.

Somewhat to my frustration, I don’t work on the web as an ‘object of study’ –  and therefore feel uncertain about what it can answer and how its form is shaping the conversation; but I did want to suggest that the web itself and more particularly the UK Web Archive provides an opportunity to re-think what is possible, and to rethink what it is we are asking; how we might ask it, and to what purpose.

And I suppose the way I want to frame this is to suggest that the web itself brings on to a single screen a series of forms of data that can be subject to lots of different forms of analysis. A few years ago, when APIs were first being advocated as a component of web design, the comment that really struck me was that the web itself is a form of API, and that by extension the Web Archive is subject to the same kind of ‘re-imagination’ and re-purposing that an API allows for a single site or source.

As a result, you can – if you want – treat a web page as simple text – and apply all the tools of distant reading of text – that wonderful sense that millions of words can be consumed in a single gulp. You can apply ‘topic modelling’, and Latent Semantic Analysis; or Term Frequency/Inverse Document Frequency measures. Or, even more simply, you can count words, and look for outliers – stare hard at the word on the web!
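To make the simplest of these measures concrete, here is a minimal sketch of a Term Frequency/Inverse Document Frequency calculation using only the Python standard library. The three sample ‘pages’ and the function names (tokenize, tf_idf) are invented for illustration; a real study would start from text extracted from archived captures and would probably lean on an established library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {word: score} dict per document.

    tf  = occurrences of the word in the document / words in the document
    idf = log(number of documents / number of documents containing the word)
    """
    token_lists = [tokenize(doc) for doc in documents]

    # Count, for each word, how many documents it appears in at least once.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    n_docs = len(documents)
    scores = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty pages
        scores.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical sample texts standing in for archived pages.
pages = [
    "the court heard the prisoner speak",
    "the archive holds the web page",
    "the prisoner was transported",
]
scores = tf_idf(pages)
```

A word such as ‘the’, which turns up on every page, scores zero; a word confined to a single page scores highest. That is the ‘outlier’ logic in its barest form.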

But you can also go well beyond this.  In performance art, in geography and archaeology, in music and linguistics, new forms of reading are emerging with each passing year that seem to me to significantly challenge our sense of the ‘object of study’ – both traditional text and web page.  In part, this is simply a reflection of the fact that all our senses and measures are suddenly open to new forms of analysis and representation.  When everything is digital – when all forms of stuff come to us down a single pipeline -  everything can be read in a new way.  

Consider for a moment the ‘LIVE’ project from the Royal Veterinary College in London, and their ‘haptic simulator’. In this instance they have developed a full scale ‘haptic’ representation of a cow in labour, facing a difficult birth, which allows students to physically engage and experience the process of manipulating a calf in situ. I haven’t had a chance to try this, but I am told that it is a mind-altering experience. It suggests that reading can be different; and should include the haptic – the feel and heft of a thing in your hand. This is being coded for millions of objects through 3D scanning; but we do not yet have an effective way of incorporating that 3D text into how we read the past.

 The same could be said of the aural - that weird world of sound on which we continually impose the order of language, music and meaning; but which is in fact a stream of sensations filtered through place and culture.  


Projects like the Virtual St Paul's Cross, which allows you to ‘hear’ John Donne’s sermons from the 1620s from different vantage points around the yard, change how we imagine them, and move from ‘text’ to something much more complex and powerful. They begin to navigate that normally unbridgeable space between text and the material world. And if you think about this in relation to music and speech online – you end up with something different on a massive scale.

One of my current projects is to create a soundscape of the courtroom at the Old Bailey – to re-create the aural experience of the defendant – what it felt like to speak to power, and what it felt like to have power spoken at you from the bench. And in turn, to use that knowledge to assess who was more effective in their dealings with the court, and whether having a bit of shirt to you, for instance, affected your experience of transportation or imprisonment. And the point of the project is simply to add a few more variables to the ones we can securely derive from text.

It is an attempt to add just a couple more columns to a spreadsheet of almost infinite categories of knowing. And you could keep going – weather, sunlight, temperature, the presence of the smells and reeks of other bodies. Ever more layers to the sense of place. In part, this is what the gaming industries have been doing from the beginning, but it also becomes possible to turn that creativity on its head, and make it serve a different purpose.

In the work of people such as Ian Gregory, we can see the beginnings of new ways of reading both the landscape, and the textual leavings of the dead. Bob Shoemaker, Matthew Davies and I (with a lot of other people) tried to do something similar with Old Bailey material, and the geography of London, in the Locating London’s Past project.

This map is simply colours blue, red and yellow mapped against brown and green.  I have absolutely no idea what this mapping actually means, but it did force me to think differently about the feel and experience of the city.  And I want to be able to do the same for all the text captured in the UK domain name. 

All of which is to state the obvious. There are lots of new readings that change how we connect with historical evidence – whether that is text, or something more interesting. In creating new digital forms of inherited culture – the stuff of the dead – we naturally innovate, and naturally enough, discover ever changing readings. But the Web Archive challenges us to do a lot more; and to begin to unpick what you might start pulling together from this near infinite archive.

In other words, the tools of text are there, and arguably moving in the right direction, but there are several more dimensions we can exploit when the object of study is itself an encoding.

Each web page, for instance, embodies a dozen different forms. Text is obvious, but it is important to remember that each component of the text – each word and letter on a web page – is itself a complex composite. What happens when you divide text by font or font size; weight, colour, kerning, formatting etc.? By location – in the header, or the body, or wherever the CSS sends it; or more subtly by where it appears to a user’s eye – in the middle of a line – or at the end?
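As a rough illustration of dividing text by location, the sketch below counts words separately for each enclosing HTML element, using Python's standard html.parser module. It treats the tag (title, h1, p, b) as a crude stand-in for position and emphasis; recovering the font, colour or on-screen position a reader actually sees would mean resolving the CSS in a rendering engine, which this does not attempt. The class name WordsByTag and the sample markup are invented for the example.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class WordsByTag(HTMLParser):
    """Count words separately for each enclosing HTML element."""

    VOID = {"br", "img", "hr", "meta", "link", "input"}  # tags with no content
    SKIP = {"script", "style"}                           # code, not prose

    def __init__(self):
        super().__init__()
        self.stack = []                     # currently open elements
        self.counts = defaultdict(Counter)  # tag -> word counts

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        location = self.stack[-1] if self.stack else "(no element)"
        if location in self.SKIP:
            return
        words = re.findall(r"[a-z']+", data.lower())
        self.counts[location].update(words)


# Hypothetical markup standing in for an archived page.
sample = (
    "<html><head><title>Old Bailey</title></head><body>"
    "<h1>Old Bailey Online</h1>"
    "<p>The trial of a <b>beggar</b> at the Old Bailey.</p>"
    "</body></html>"
)
parser = WordsByTag()
parser.feed(sample)
counts = parser.counts
```

Each element becomes its own column: the words in the heading can now be compared with the words in the running text, or the words someone chose to put in bold.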

Suddenly, to all the forms of analysis we have associated with ‘distant reading’ there are five or six further columns in the spreadsheet – five or six new variables to investigate in that ‘big data’ open-eyed sort of way.

And that is just the text. The page itself is both a single image, and a collection of them – each with its own properties. And one of the great things that is coming out of image research is that we can begin to automate the process of analysing those screens as ‘images’. Colour, layout, face recognition etc. Each page is suddenly ten images in one – all available as a new variable; a new column in the spreadsheet of analysis. And, of course, the same could be said of embedded audio and video.

And all of that is before we even look under the bonnet. The code, the links, the metadata for each page – in part we can think of these as just another iteration of the text; but more imaginatively, we can think about them as more variables in the mix.
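Looking under the bonnet can be sketched in the same spirit. The example below pulls the outbound links and the named meta tags out of a page's markup with Python's standard html.parser and urllib.parse modules. The class name LinksAndMeta, the base URL and the sample markup are all assumptions made for the illustration, not anything the UK Web Archive itself provides.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinksAndMeta(HTMLParser):
    """Collect hyperlinks and named <meta> tags from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.links = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]


# Hypothetical page and address, for illustration only.
sample = (
    '<html><head><meta name="description" content="A talk"></head><body>'
    '<a href="/about">About</a> <a href="http://example.com/x">Elsewhere</a>'
    "</body></html>"
)
parser = LinksAndMeta("http://example.org/blog/post.html")
parser.feed(sample)

# The set of hosts a page points to is one more column for the spreadsheet.
hosts = {urlparse(link).netloc for link in parser.links}
```

Run across many captures, the link lists are also the raw material for treating the archive as a network of connections rather than a pile of pages.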

But, of course, that in itself misunderstands the web and the Web Archive. The commonplace metaphor I have been using up till now is of a ‘page’ – and is the intellectual equivalent of skeuomorphism – relying on material world metaphors to understand the online.

But these aren’t pages at all; they are collections of code and data that generate an experience in real time. They do not exist until they are used – if a website in the forest is never accessed, it does not exist. The web archive therefore is not an archive of ‘objects’ in the traditional sense, but a snapshot from a moving film of possibilities. At its most abstract, what the UK Web Archive has done is spirit into being the very object it seeks to capture – and of course, we all know that in doing so, the capturing itself changes the object. Schrödinger's cat may be alive or dead, but its box is definitely open, and we have visited our observations upon its content.

So to add to all the layers of stuff that can fill your spreadsheet, there also need to be columns for time and use; re-use and republication. And all this is before we seek to change the metaphor and talk about networks of connections, instead of pages on a website.

Where I end up is seriously jealous of the possibilities, and seriously wondering what the ‘object of study’ might be. In the nature of an archive, the UK Web Archive imagines itself as an ‘object of study’, created in the service of an imaginary scholar. The question it raises is how do we turn something we really can’t understand, cannot really capture as an object of study, to serious purpose? How do we think at one and the same time of the web as alive and dead, as code, text, and image – all in dynamic conversation one with the other? And even if we can hold all that at once, what is it we are asking?
