Posted By: longzheng | Jul 10th @ 1:06 AM
page 1 of 2
Comments: 46 | Views: 845

So I've been tweeting about a new idea I had and finally got around to writing about it in great detail. I've since posted it on my blog for public review. Obviously there's a lot of smart (and smarter than myself) fellows here so I'd really appreciate it if any of you could help me by taking a look.

http://www.istartedsomething.com/20090710/reimagining-file-distribution-universal-downloads/

While I'm here, I would also like to thank Sven Groot, who helped me validate and redefine the idea. :)

Part of your idea solves an issue that has been bugging me for a long time. I have a generic "Downloads" folder, as I'm sure many others do. It's basically the staging area that everything gets dumped into before I decide where it should go. Some stuff stays in there indefinitely.

I'll end up with a lot of random installers, EXEs, documents, images, etc. Like right now, I have an anonymous setup.exe file. The file properties tell me nothing. It has a digital signature that says it's from Microsoft, but that's not helpful, and not everybody signs their stuff. The only solution at this point is to run it. UAC is able to tell me that it's VSTS 2008 Database Edition GDR. However, installers sometimes take a while to get to the point where you can see what they are and kill them.

I've been plagued by the same sort of scenario more times than I can remember. So, your idea presents a very cool solution. I can see an explorer extension that adds a "WTF is this?" context menu item. Or something that can store descriptive metadata locally.

I think a wiki/moderation model is a good start, but could run into problems. There are numerous threads here on C9 that hash (:P) over various methods for ensuring the quality of the forums, reducing spam, mitigating trolls and knowing who to trust. I think some of those discussions might be obliquely helpful in figuring out how to make sure that the contributions are of high quality. You're going to have to dig them up yourself; it's late and I'm lazy.

mikexkearney
Only when trust is earnt can sex and free discounts be exchanged.

Yes, this bugs me too. I have a habit of saving software in the 'Downloads' folder, then eventually transferring it to my external hard drive when my laptop is nearly full. Moving these files is a chore in itself as I can't remember what these files are. Their names are never consistent nor informative.

I downloaded Virtual PC 2007 yesterday, and the 32-bit and 64-bit versions both have an installer called 'setup.exe', duh!

On another subject, I'd like better 'regulations' enforced over the consistency of metadata somehow (no idea how). Anybody who owns a Zune and has multiple music download sources knows what I mean.

Sven Groot
My name has 9 letters. Coincidence? I think not...

Why is that a problem? I just change the name of the file to something more descriptive before downloading.

mikexkearney
Only when trust is earnt can sex and free discounts be exchanged.

More of an irritant, something that should have been established at the source.

Dodo
I'm your creativity creator™ :)

File hash, file size and type of hashing algorithm should suffice to identify a file. Though, at sizes below 1024 bytes, generating files that match certain criteria becomes very easy. Furthermore, file sharing systems already do this kind of thing, yet they never really took off because they are inconvenient to use. They end up as a 'last resort', mostly for illegal file sharing, because decentralized file distribution cannot be easily blocked. Legal files are very rare in file sharing networks, because direct HTTP downloads are faster and more convenient in most cases.
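
As a rough illustration of the identifier described above, here is a minimal Python sketch that combines the algorithm name, the file size and the digest. The function name `file_identity` and the tuple layout are assumptions made for this example; they are not part of the proposal.

```python
import hashlib
import os
import tempfile


def file_identity(path, algorithm="sha256", chunk_size=65536):
    """Return (algorithm, size_in_bytes, hex_digest) for the file at path."""
    hasher = hashlib.new(algorithm)
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a large download need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return algorithm, size, hasher.hexdigest()


if __name__ == "__main__":
    # Demonstrate on a throwaway file standing in for an anonymous setup.exe.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    try:
        print(file_identity(tmp.name))
    finally:
        os.remove(tmp.name)
```

Two files that share all three values are, for practical purposes, the same file, provided the chosen algorithm is still considered collision resistant.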

blowdart
Peek-a-boo

SHA1? Really? A hashing algorithm which was broken in 2005?

My bad. Somewhere I got links crossed.

I just download them to the desktop and rename them before or after. At most, I think the downloader should just add the URL as a new tag or append it to the description (and still preserve the timestamps).

May28th2018
May 28th, 2018

OMG, somebody finally thought of RPM gpg keys and yum/apt-get

http://www.google.com/search?q=rpm+gpg+key

.... 12 years ago

sorry buddy.

I think you've got it backwards... my reading of RPM/GPG keys with yum would seem to indicate that they are used to verify that a package is correct, but not to specifically identify or find an unknown file given a known hash (i.e., the reverse of what you are pointing to).

SlackmasterK
I write my OWN blogging engines

I'd like to see a file system that lets me have multiple versions of the same filename in the same place, kind of like the GAC does but more NTFS-ey.

So this is like Authenticode (signing files with certificates) but extended to all files, not just code artifacts.

Or like DNS but for files instead of IP-enabled endpoints.

The problem with setting up a universal hash database is that the participants can't keep it up to date without talking to each other, so you need some sort of mechanism (e.g., certificate authorities) to verify your hash lookups.

A karma-based system might work in a (slightly less than) ideal world, but I don't think it would fly in the real world -- indemnity matters more to businesses than karma.

blowdart
Peek-a-boo

Oh and the best bit? As soon as you start calculating hashes, it's doubtful you can be a common carrier, as you're inspecting files. So suddenly you're liable for lots and lots of copyright infringement lawsuits.

I don't get digital signatures. Most crapware uses unknown digital signatures anyway. Yeah, they are signed, but that tells me nothing about the integrity of a creator like TrojanEdu.

exoteric
I : Next<I>

The torrent is acquired via HTTP - maybe, but surely it consists of hash-identified files or file segments.

See also http://bitpedia.org/  &  http://bitzi.com/developer/xml

There is much metadata that one could attach directly to a URI - one might invent one's own URN type or separate URI scheme. But still, it is probably best to keep things distinct and as simple as possible. The existence of the hash should be sufficient - after that, anyone can say anything about that hash and trust needs to be established.

SHA1 may not be the best choice as mentioned but that's more or less a detail. There is SHA256 and SHA512 and Whirlpool, etc.
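
To make the "hash as identifier, algorithm as a detail" point concrete, here is a small Python sketch. The `urn:<algorithm>:<hex digest>` layout is invented for this example and is not a registered URN namespace; the names `make_urn`, `parse_urn` and `matches` are likewise hypothetical. Only SHA-256 and SHA-512 are listed because `hashlib` always provides them.

```python
import hashlib

# Hypothetical identifier layout: urn:<algorithm>:<hex digest>.
SUPPORTED = ("sha256", "sha512")


def make_urn(data, algorithm="sha256"):
    """Build an identifier that names the algorithm alongside the digest."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return f"urn:{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def parse_urn(urn):
    """Split an identifier into (algorithm, digest), rejecting unknown forms."""
    prefix, algorithm, digest = urn.split(":")
    if prefix != "urn" or algorithm not in SUPPORTED:
        raise ValueError(f"not a recognised hash URN: {urn}")
    return algorithm, digest


def matches(data, urn):
    """Check whether data hashes to the digest carried in the identifier."""
    algorithm, digest = parse_urn(urn)
    return hashlib.new(algorithm, data).hexdigest() == digest
```

Because the algorithm travels with the digest, swapping in a stronger hash later would not invalidate identifiers that were issued earlier.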

I find the larger problem of sustainable storage to be more interesting. A system that identifies and protects endangered data transparently. It may end up having historical significance or only individual significance.

I wonder if in the distant future, we'll not just be using one large "drive" that transparently ensures that files are secured in time and space.

As I understand the proposal, the hashes are calculated by whoever creates one of the special links to a copy of the data. Then when someone clicks on the link it talks to a web service to see if anything else is hosting the same data/hash. (I think it will then add the url/hash to the database if it wasn't in there already, but maybe that's the job of whoever created the link.)

The data itself isn't downloaded by the web service, and the web service doesn't calculate/verify the hash.
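
A minimal in-memory sketch of that flow in Python, under the reading given above: the service only stores hash-to-URL claims and never touches the file bytes. The class name `MirrorRegistry`, the `lookup` method, and the choice to register the URL during lookup are all assumptions for illustration.

```python
from collections import defaultdict


class MirrorRegistry:
    """Hypothetical lookup service mapping a file hash to URLs claiming to host it."""

    def __init__(self):
        self._mirrors = defaultdict(set)

    def lookup(self, file_hash, url):
        """Record url under file_hash and return the other known mirrors.

        The service trusts the hash supplied in the link; it does not
        download the file or verify the hash itself.
        """
        known = sorted(self._mirrors[file_hash] - {url})
        self._mirrors[file_hash].add(url)
        return known


if __name__ == "__main__":
    registry = MirrorRegistry()
    print(registry.lookup("abc123", "http://example.com/setup.exe"))
    print(registry.lookup("abc123", "http://mirror.example.org/setup.exe"))
```

Since nothing is verified server-side, a client would still need to hash whatever it downloads from a mirror and compare the result against the hash in the link.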

stevo_
Human after all

I don't think this is a reimagination really..

ManipUni
Proving QQ for 5 years!

I was thinking the same thing.

Oh and a huge database containing all the file hashes would be abused into the ground by police, governments, ISPs, private companies, customs, etc.

May28th2018
May 28th, 2018

Yum and apt-get take a human readable name as an argument

I.e., I want Firefox:

yum install firefox

Or you can do it via the yum or apt gui. There is no need for any further complexity.

Sergey Brin had a similar idea using hashes for copy protection on the web called COPS.

http://infolab.stanford.edu/~sergey/copy.html

document chunk hash pseudo code here.

Sergey's idea resembles BitTorrent more than it does longzheng's file distribution system though, as BitTorrent tracks files by dividing them up into pieces and assigning hash integrity keys to them.
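
For readers unfamiliar with the piece-hash approach mentioned above, here is a rough Python sketch. It uses SHA-1 per piece, as the original BitTorrent format does, but the function names and the default piece size are assumptions for this example, and it is not an implementation of the actual protocol.

```python
import hashlib


def piece_hashes(data, piece_size=262144):
    """Split data into fixed-size pieces and return one SHA-1 hex digest per piece."""
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_size]).hexdigest()
        for offset in range(0, len(data), piece_size)
    ]


def corrupt_pieces(data, expected, piece_size=262144):
    """Return the indexes of pieces whose hash differs from the expected list."""
    actual = piece_hashes(data, piece_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

Hashing per piece means a damaged download can be repaired by fetching only the pieces that fail the check, whereas a single whole-file hash can only say that something, somewhere, is wrong.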

At any rate, it never materialized. I don't see this hash-based software repository system for Windows materializing either. The only companies capable of implementing it successfully would be download.com, Wise, InstallShield, etc. Otherwise it would have no traction or interest.

littleguru
<3 Seattle

That's also what immediately came to my mind when I read the post. :D

littleguru
<3 Seattle

The idea is nice. :) But there are still some concerns that need to be thought through before this could roll. One of them: who is the authority that makes sure a certain hash really represents a certain file? If that is done by the community, you need to allow multiple files to be attached to the same hash; otherwise someone could "block" a hash. But then I might get multiple results for one hash.
