Individuals own their data

January 14, 2011

I picked up last year that Tantek was using his personal website to author all of his data. He is a champion of 'fighting for the user', but my initial take on this was that I would need an ultra-smart authoring tool to cater for all the publishing activities I can perform today, and more and more tomorrow. However, if you use third-party tools, the data is then 'owned' by that service. And even then, for some services you will need local tools to get the data to them: hardware, a camera and a local uploading tool to get photos to e.g. flickr.com. So I end up with a local copy (my PC, Sony camera, local Microsoft software and, online, Yahoo's Flickr) and I need lots of local tools just to reach the online service that now 'owns' my data! This is a fascinating tension, and a couple of blog posts, from Zeldman and Tantek, give the arguments: decentralized tools that self-aggregate versus self-publishing and copying everywhere.

A conference is coming up in June this year: the IndieWeb unconference in Portland, USA, for those fighting for individuals to own their data.

content authoring + silo data still holding sway

January 12, 2011

2011, and Quora.com seems to be the next site to seep into the mainstream of early adopters online. I describe these sorts of sites as prompted content authoring. What is unprompted content authoring? That would be publishing a blog post. A Twitter update is also authoring (note that those with a near-monopoly authoring platform then have an exclusive over doing stuff with that data, or via an API can allow others to help them do so). Authoring seems pretty straightforward to describe, but prompted versus unprompted seems harder to pin down, as every bit of information is authored in a conversation. Right?

I want to put forward an observation to distinguish unprompted content authoring from prompted. Unprompted content is authored of an individual's free will, maybe in response to a conversation or an idea, or to take issue with another post. The point is that they are free to author any way they like, and additionally they are free to host and publish the content where they please. The downside is that the content will be disintermediated from all the other content in the information universe. Comments and trackbacks aid connection, but they rely on direct linking from those who already know the content has been published, and who knows everything?

Prompted content authoring constrains the context and gathers demand around that context to increase the probability of answers, and of course answers are just more prompted content authoring. For me the prompted/unprompted divide is based not on the act of authoring content per se, but on setting up the community environment so that more content is authored in context around aggregated demand. The Quoras are bringing efficiency to this by centralising the demand, much as Groupon does for individuals aggregating demand for local suppliers. I think time will show this need to centralise to be a transient period in the evolution of the web, as all the 'answers' get authored and all individuals become connected in context regardless of where they author their content online (i.e. no single website set up for each purpose). This decentralized demand authoring is better described in this unprompted post on the EmergentByDesign blog.

open source search

October 1, 2010

There is an interesting post on open source search / filtering software on the O'Reilly blog today. mepath is powered by open source software throughout: PHP for the language, Linux for the server OS, MySQL for the database and Apache for the web server, and the website itself is produced using the lifestylelinking open source project. The game now, with all this data, is to find the 'signal', in LinkedIn's words: that is, filtering the content based upon the structured data they hold, from professional profiles to company data. Bring in a Twitter stream and you have the firehose of (too much) data. Can it be filtered to find the best stuff? This is all great and a step forward, but the game changer is when this data gets used to actually make the world go around. Right now it is being used to filter content based upon social capital or professional capital, and while humans have their hands on the levers of business this is enough for progress. But why not get this data working at source, contributing to whatever its purpose is?
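As a toy illustration of that kind of filtering (my own sketch, not the lifestylelinking code, and the profile keywords are invented for the example), scoring incoming stream items against a set of interest keywords and keeping the strongest matches might look like:

```python
# Toy sketch: score incoming stream items against a profile of
# interest keywords and keep only items above a threshold, best first.
# Illustrative only -- not the actual lifestylelinking algorithm.

def score(text, keywords):
    """Count how many profile keywords appear in the text."""
    words = set(text.lower().split())
    return sum(1 for k in keywords if k in words)

def filter_stream(items, keywords, min_score=1):
    """Return items whose keyword score meets the threshold, highest score first."""
    scored = [(score(t, keywords), t) for t in items]
    return [t for s, t in sorted(scored, reverse=True) if s >= min_score]

# Hypothetical profile and firehose items for demonstration.
profile = {"opensource", "search", "filtering"}
stream = [
    "new opensource search project released",
    "what I had for lunch today",
    "filtering the twitter firehose with opensource tools",
]
print(filter_stream(stream, profile))
```

Real systems would of course weight terms and use the structured profile data the post mentions; this only shows the basic shape of demand-side filtering.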

Intent web (attention economy) ready to go

September 22, 2010

Matthew Kumin's blog post entitled The Web of Intent is Coming (sooner than you think) gives an excellent introduction to how the next generation of tools for individuals will bring the web of intent to the fore. He lists, 1 to 5, the key ingredients of the changes. It is interesting to note the use of the word 'integrated', which appears in connection with the 'integration of search and publishing'. In fact four of the five bullet points talk about publishing; only bullet point 5 focuses solely on the individual, and it is bullet point 5 that fits with the goals of the lifestylelinking project.

It is surprising to see the publishing side of the question so prominent. I think the reason Matthew presents intent this way is that he knows publishers (any authors of content) want to contribute and participate in the right context online, so integrating the search side gives them this view. However, if each individual had their own no. 5 'curated feeds', they would have all the information they require to author their contributions. The other reason I can think of is a school of thought that current authoring CMSs (content management systems) could do better at expressing the intent of the publisher, wrapping the text in tags or an ontology for the linked data community, etc. The observation to make here is that the text, i.e. the words the author expresses, is the most important articulation in the whole publishing process, and subsequent 'markup' or 'tagging' is secondary to the communication expressed by the publisher. This should not be forgotten: the primary intent of the author is what they author, not what they wrap it in.

all information is relative?

September 6, 2010

The website PeerIndex.net has talked about its 'relative measure of an individual's online authority', as the Semantic Web Blog puts it for them. PeerIndex is similar to Klout.com and others looking to establish an online influence score. The idea of 'relative' is interesting; it is what we use on mepath.com and in the lifestylelinking open source project. As far as I can understand, the relative authority index combines three parts: authority, audience and activity. I class this as log data, that is, data produced in serving up and interacting with a webpage. It is valuable data, as Google's PageRank demonstrates for every search query. What I have found over the years of exploring data from such sources is that it is only the second-best proxy for the real context or meaning authored in the text. That is why mepath and the lifestylelinking project base relativity on the context of each blog post as authored. Put another way, no log data at all is used in the calculation of 'relative'.
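A minimal sketch of what relating posts purely by their authored text, with nil log data, could mean: comparing two posts by the cosine similarity of their word counts. This is my own illustrative assumption, not the actual lifestylelinking calculation, and the example posts are invented.

```python
import math
from collections import Counter

# Illustrative sketch (not the lifestylelinking code): relate two posts
# purely by their authored text -- no clicks, links or other log data.

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

# Hypothetical posts: the first two share context, the third does not.
post1 = "relative authority measured from the text of each blog post"
post2 = "each blog post text gives its own measure of authority"
post3 = "photos from my holiday"
print(cosine_similarity(post1, post2) > cosine_similarity(post1, post3))
```

The point of the sketch is only that the score comes entirely from what the author wrote, so two posts rank as related because of their shared context, not because of any audience or activity signal.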

Questions & Answers linking

August 18, 2010

The question and answer format of information has been ever present on the web, from the first FAQs to the more sophisticated community approaches of e.g. GetSatisfaction and Hunch. And now the NLP / semantic web / machine automation startups are entering the fray: Swingly has opened its beta doors.

I've not used the service yet, but this article provides a good insight into the thinking behind it. They state they are a 'web scale answer engine' or a 'micro local' search engine, but the key is the automation provided by NLP to classify and connect factual answers to questions. I am never sure what factual answers are; WolframAlpha uses the term too, for things like the laws of physics that resolve to a precise meaning or number. In the world of life, things are a bit less definitive, I always think. The Swingly service talks about addressing trustworthiness and getting the most up-to-date data to build a social graph (much as Klout.com does for influence), but I think in the end transparency over the linking logic is what builds trust. I foresee much more innovation in this area.

Data Portability Policy

June 23, 2010

The Data Portability Project released the Portability Policy framework today. The idea is that, just as a website's terms of use and privacy policy must be accepted by the user before use, a portability policy would set out the terms on which their data can be moved around the web. This is a welcome initiative, but I would prefer to see VRM (vendor relationship management) thinking taken forward too: an individual-centric portability policy that the website owner would have to accept.

open sourcing – mepath

June 21, 2010

The code that operates mepath.com was open sourced last week. It is called the lifestylelinking open source project.

data decentralization

May 12, 2010

The gravity of Facebook has started to pull on all data on the web. Giving individuals the power to expand their social network to all places on the web is a good thing. The question is: should one business be the monopoly provider of this utility? Some young developers think there is a need for an alternative, a decentralized alternative; they have crowd-sourced funding and are called the Diaspora Project.

The team at ReadWriteWeb have a great overview of them. The most interesting thing they write about Diaspora is that it is taking a WordPress-like development stance: open source on the one hand and a managed hosted service on the other, which I'd call a cloud service. They point to no centralization, but I have found that the wordpress.com service does exactly this: it restricts plugins and other functionality for security reasons. I'd like to see this hosted or cloud service have the same standalone freedom as a self-installed server. The technical implications of that are not fully known, but if the goal is data portability and interoperability at the individual's control, then we should aim for those same standards in the applications that enable that empowerment.

Semantic web summary video

May 11, 2010

This video provides a good introduction to the status, debates and tech behind the semantic web.