The eResearch conference was in Sydney Monday to Wednesday. I had a terrific time and came back with a lot of ideas.
The keynote speech was from Dr Clifford Jacobs about EarthCube. He stated that the knowledge environment can be divided into four quadrants.
Making knowledge visible
Building knowledge intensity
Building knowledge infrastructure
Building knowledge culture
· Value and culture
· Rewards
· Trust
· Sharing exchange
Challenges
· Collections and sufficient metadata
· Trust
· Usability
· Interoperability
· Diversity
· Security
· Education and training
· Data publication and access
· Commercial exploitation, e.g. Google Maps
· New social paradigms – crowd sourcing
· Preservation and sustainability
· Stakeholder alignment
Must be able to create value by expanding the available pie
Mitigate harm
Ensure systems are stable and agile
A social network was established early and collaboration was encouraged. For instance, there were seven different proposals for ontologies and semantic web groups, and they were asked to group together to try to minimise silos of information.
The parts not developed well by researchers were
Governance and standards
Discipline specific needs and drivers
Education and training
The practical things were well taken care of by the researchers.
The top 6 quoted barriers were
· No time
· No repository
· Files too big/ server too small
· Don’t want their research to be scooped by other researchers
· Uneven standards of metadata
Seven modes of success
· They are proactive
· Began with an end goal
· Prioritised tasks for instance governance first
· Emphasised non-competitive and broadly inclusive process
· Listened to the community
· Facilitated synergy within and across communities
· Engaged and energised colleagues
Reasons for failure
· Unrealistic and mis-aligned expectations
· An attitude of build it and they will come
· Not valuing what exists
· Not advancing the frontier in transformational ways
· Not engaging researchers
· Not anticipating the needs of the next generation of researchers
· Unknowns
Dell cloud product
Data always stored in Australia
For most people the country of storage is the no. 1 issue
They made the point that if you can move the data into the cloud easily, you should also make sure you can move it out.
So the Dell solution is
Open
Integrated via Boomi
Security: the client has access to the security information
Can pay by
· Hour
· Month
· Or blade
Oracle
Digital preservation is a series of managed activities necessary to ensure continued access to digital materials for as long as necessary.
Issues of format of file type and storage medium.
Additional challenges
Humanities
Data not born digital
Digital data has large data protection problems, for instance bit flip and bit rot
Multiple copies
Data integrity and validation
A facility must say if there are problems with the data.
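The points about multiple copies, integrity and validation can be illustrated with a checksum comparison. This is only a minimal sketch, assuming SHA-256 as the checksum and using made-up copy names and sample data; none of it was shown in the talk.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def check_copies(copies: dict, expected: str) -> list:
    """Return the names of copies whose checksum no longer matches."""
    return [name for name, data in copies.items() if sha256_of(data) != expected]


# Hypothetical sample data: record the checksum when the item is ingested.
original = b"field notes, digitised survey"
expected = sha256_of(original)

# Simulate bit rot in one copy by flipping a single bit.
damaged = bytearray(original)
damaged[0] ^= 0x01

copies = {"site_a": original, "site_b": bytes(damaged), "tape": original}
bad_copies = check_copies(copies, expected)
print(bad_copies)
```

A facility that runs a check like this on a schedule can report which copy is damaged and restore it from one of the good copies.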
Preservation standard OAIS
Software
· Tessella
· Fedora Commons
· DSpace
· DuraCloud
This software only puts data into a preservation format; it does not carry out data curation
Preservation is a service
Analogue -> Digital
Ingest and convert to preservation format: PDF/A = PDF for archiving
Automated verified tiered content infrastructure
JHOVE is an application which identifies what file format a file uses.
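To give a rough idea of what format identification involves, here is a much simplified sketch that guesses a format from the first bytes of a file. It is not how JHOVE itself is implemented or invoked; the function name and the short signature table are assumptions made for illustration.

```python
# A few well-known file signatures ("magic bytes"); the table is far from complete.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
    (b"\xff\xd8\xff", "JPEG"),
]


def guess_format(header: bytes) -> str:
    """Guess a file format from the leading bytes of a file (hypothetical helper)."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


print(guess_format(b"%PDF-1.4\n"))
print(guess_format(b"plain text"))
```

A real tool goes further and validates the whole file against the format specification, which a signature check alone cannot do.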
So automated transfer from one technology to a new one.
Oracle T10000C provides data integrity by using CRC.
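The idea behind a CRC can be sketched in a few lines. This uses the CRC-32 from Python's standard zlib module as a stand-in; the CRC variant and block layout actually used by the tape drive are not covered in these notes, and the sample record is made up.

```python
import zlib


def crc_matches(data: bytes, stored_crc: int) -> bool:
    """Recompute the CRC-32 of the data and compare it with the stored value."""
    return zlib.crc32(data) == stored_crc


# Hypothetical record: the CRC is computed and stored when the data is written.
record = b"sensor reading block 0001"
stored_crc = zlib.crc32(record)

# Simulate a single bit flip on the medium.
flipped = bytearray(record)
flipped[3] ^= 0b00000100

print(crc_matches(record, stored_crc))          # intact data
print(crc_matches(bytes(flipped), stored_crc))  # corrupted data
```

A CRC detects the corruption but does not repair it, which is why it is paired with multiple copies.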
Tape has a shelf life of 30 years
TERN
To use AusGOAL implies copyright, which implies some person had input. However, in the instance of a sensor feed no person is involved and therefore copyright does not exist.
Birds of a feather session Day two
Data management can be measured on two axes
· How much research funding did we gain?
· How many publications?
The question is how much better research did we do?
Data management is about a conversation, so we have to be prepared to listen as well as talk.
Have to show the researchers something that means something
Show them something that they might be interested in
Need to build a collection of success stories that apply to the researcher and the level of research they are doing.
Therefore need to cherry-pick stories to suit the audience.
Communication is the most important thing.
Monash used a forum of bright ideas over breakfast (supplying breakfast)
They used a researcher as the speaker, and the speaker is also a champion of data management practices.
The venue makes a difference
There is a positive correlation between data sharing and a researcher's h-index
CAUDIT (Council of Australian University Directors of Information Technology) has been benchmarking IT spending since 2003. In the first year there was resistance, but then the directors started to share more and more. CAUDIT now produces a report for its members only, and the members really like the information, with participation now at 100%. The members are now ringing and asking for the report early.
Intersect talked about the difficulty of measuring how successful their intervention is in the research community. They talked about the importance of forming a community of practice for eResearch. They found breakfasts worked well when informal with a short presentation. However, they did need a short agenda.
Need to tell the story from their perspective: how is this going to help with funding? NeCTAR uses hypothetical measures of success, such as usage measured with Google Analytics.
Choose case studies carefully as an early adopter is not necessarily the best case study.
The problem researchers have is
· Distributed data
· Messy data
· How to keep data secure.
QCIF – will have a cloud storage facility by late December
Data citation enables
Better researcher discoverability
Acknowledgement and reward for researchers
The researcher's h-index is enhanced and therefore the institution's reputation is enhanced.
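For readers unfamiliar with the H index mentioned in these notes: it is the largest number h such that a researcher has h outputs each cited at least h times. A minimal sketch of the calculation follows, using made-up citation counts and treating cited datasets the same as cited papers, which is my assumption rather than something stated in the session.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one researcher's papers and datasets.
print(h_index([10, 8, 5, 4, 3]))
```

If citations to datasets are counted alongside papers, a well-cited dataset can raise the figure in the same way a well-cited paper does.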
Question
How will USQ get visibility for researchers and their data?
The value of data citation
User driven
Good data management practice
Need to reference their own data
A requirement for stable links to data is served through the provision of a DOI
Data collection is expensive
The emphasis is on collaboration not competition
Opportunities can be lost through lack of access to the data
Data is irreplaceable
DOIs are part of the solution
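The stable link works because a DOI can be turned into a URL through the doi.org resolver, which redirects to wherever the data currently lives. A minimal sketch, assuming a hypothetical helper name and a made-up DOI used purely for illustration:

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Build a resolver URL from a DOI string (hypothetical helper)."""
    doi = doi.strip()
    # Accept the common "doi:" prefix as well as a bare DOI.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Keep the slash between prefix and suffix; escape anything unusual.
    return "https://doi.org/" + quote(doi, safe="/")


# The DOI below is invented for the example and does not point at a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))
```

If the dataset moves to a new repository, only the record held by the resolver changes, so citations that use the DOI keep working.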
Performance metrics
Policy must be relevant
Data metrics (DOIs, citation indices)
DataCite has a global reach
CSIRO uses a system-based automated process for data citation.
Researchers are interested in DOIs and understand them.
They are beginning to see papers with digital citations coming through.
ANDS is working with Scopus and Elsevier
Altmetrics are able to give some metrics without the long wait of traditional publication.