All times are UTC-05:00




PostPosted:Wed Oct 24, 2007 7:46 pm 
 

Joined:Sun Jun 10, 2007 11:41 am
Posts:344
Location:Ninjaville
Multicore isn't bad, but especially for something like gaming, one out-of-order execution core could handle most situations as fast as six in-order cores, at half the development cost. Still, as a mainly-PC gamer with a dual-core Opteron, I'm happy, because now everything will be optimized for dual core.


PostPosted:Thu Oct 25, 2007 3:49 pm 
Connoisseur of the Godawful
 

Joined:Tue Mar 01, 2005 9:00 am
Posts:456
ICQ:286315965
Website:http://rabidtinker.mine.nu/
Yahoo Messenger:alistair_lynn
AOL:Agent_Vast@mac.com
Location:127.0.0.1
OpenMP!

_________________
Alastair Lynn / Alumnus / Onlink Team


PostPosted:Fri Oct 26, 2007 2:51 am 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Quote:
OpenMP!
The biggest issue isn't how to implement multithreading but how to properly serialize the operations that cannot be done in parallel (for example, inserting into a tree without corrupting the node linkage or its balanced structure). Most programmers don't quite understand locks and get them wrong.

_________________
- Tycho



PostPosted:Fri Oct 26, 2007 3:12 am 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Come on, folks, let's see some more potential solutions! We're here to help you learn. :)

_________________
- Tycho



PostPosted:Sun Nov 11, 2007 4:55 am 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Wow, I can almost hear the crickets in here.

_________________
- Tycho



PostPosted:Mon Nov 12, 2007 9:43 am 
 

Joined:Thu Aug 18, 2005 8:19 pm
Posts:36
AOL:Fattierob
Location:f(x)
Hm. Well, I thought I'd add the random, quirky idea I just had.

Basically, two threads, hopefully on a multi-core CPU.

One thread starts by going through the whole archive and pulling out every entry beginning with the nth letter of the alphabet (say 'a' for now), while the other thread sleeps. It then hands that array to the second thread (you'll have to excuse my wording; I only vaguely understand how threading works), and the second thread begins to quicksort the letter-'a' array. While that thread is quicksorting (hopefully entirely in main memory, otherwise its reads and writes would clash with the first thread's and defeat the point), the first thread goes through the (now smaller) archive and the process repeats with the next letter.

When the second thread finishes sorting one letter's array, it saves it to disk (either starting a new file or appending to the end of the last file it saved), then waits for the other thread to wake it up again. If by any chance the second thread doesn't finish before the first thread sends a new batch, there should be a buffer of sorts, now that I think about it.

Pros:
Overall time is reduced, since the two threads work at once
If each letter's array fits in main memory, the only I/O needed is reading the archive and writing out each sorted array

Cons:
Would not work well on a single-core CPU, or might even be slower
If main memory can't hold the ~4 GB (100 GB archive / 26 letters) that each letter averages, there will be problems

tl;dr: two threads, one selects data, the other quicksorts, with a buffer in between. Probably not usable at all.


PostPosted:Mon Nov 12, 2007 1:19 pm 
 

Joined:Mon Nov 12, 2007 11:48 am
Posts:1
ICQ:318966698
Location:Russian Federation
Fattierob's idea looks good if the data can be divided into equal groups, in his case by the first letter of the name.
But if all the records start with the letter "a", for example, the algorithm falls apart.
I think the splitting itself is the right idea.
So we need to break all the data into "small" pieces that can simply be stacked together after sorting to give the solution (*), that can be sorted quickly in memory, AND whose position in the output we can predict.
First letter is the simplest example: there are 26 groups, and we know group "a" comes first, "b" next, and so on.
But we can't hardcode the breaking method; we need one pass to determine the key ranges and split the data into groups accordingly.
The pro of this solution is multithreading: all the groups can easily be sorted by separate threads at the same time, or one by one.
The con is the extra pass, but I don't think it's a big deal; it only reads the keys, not the full records. If the records have a fixed size, like 1024 bytes, it's easy.
The sticking point is that I can't think of a good way to construct a function that predicts the bounds of each group, keeps the groups roughly equal, and preserves the (*) property.
But I believe an elegant way to calculate those borders exists.

Another "for fun" solution is a neural network that predicts the position of a record in the sorted set.
It needs way more time to develop, and even more time to train, but the result could be awesome.
The position of a record could be predicted with high accuracy in almost no time at all.
The biggest con is that the data would end up ALMOST sorted (:
It would still need small corrections, like swapping some records, and even then there is no 100% guarantee the result is totally sorted.
But the pro is SPEED! It would be about O(n): just read, judge, write.

And sorry, English is not my native language.


PostPosted:Mon Nov 12, 2007 9:58 pm 
 

Joined:Thu Aug 18, 2005 8:19 pm
Posts:36
AOL:Fattierob
Location:f(x)
I've thought about it a bit more, after being awake for more than an hour.

Basically, the idea you're suggesting is hashing: you break the data up into all possible groups, make a buffer for each, and then sort within them. (Note: I am completely and utterly talking out of a five-minute skim of my textbook on that.) The problem with hashing is that you need buffers ready for every possible group, and they have to be big enough not to overflow (or dynamic, which adds overhead).

Now, what I was thinking was something like that, but revised a bit to be threaded and a bit optimized.

The 1st thread reads through the archive sequentially, checking the first letter of each entry against n. If it is n, it sends that 1 KB of data to the second thread. When it finds no more n's in the archive, it signals the second thread to write what it has sorted to disk, then increments n.

Now, the second thread wakes up when it gets this data, and does a few things:
1) Drops it into an empty array if it's the first n.
2) If it's not the first n, it does a sort that I don't have the name for. Basically, you look at the middle entry of the 'n' array (call it the pivot) and see whether the entry you're inserting comes before or after it. If before, you repeat on the half from zero to the pivot until you find your spot; same idea if it's after, using the half from the pivot to the end.
3) If it receives a signal (a method call, I guess), it writes the now-sorted data to disk.

Example:
The 1st thread sends "ABC" to the second thread; that thread's 'n' array contains "ABC". The 1st thread sends "ACB"; the 'n' array now contains "ABC", "ACB". The 1st thread sends "AAB"; the 'n' array determines that "AAB" comes before "ABC" and places it before it (since that's the only spot left). The array now contains "AAB", "ABC", "ACB". This continues until the 1st thread tells the second thread to write to disk, which it does.


Pros:
Threaded and easy on dual-core CPUs
The second thread spends little time per insertion, since the array is already sorted when each new piece of data arrives

Cons:
If the program crashes, it has to pick up again at the last letter
Would not work well on a single core
Memory hog: an 'n' array of 4 GB on average? And you can't write it to disk in chunks before you're done sorting. Hm.

All in all, a good thought exercise. I should ask my CS teacher what she would do.


PostPosted:Mon Nov 12, 2007 11:41 pm 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Okay, you guys are making a huge mistake by dividing on the 26 letters of the alphabet. You're assuming the data is evenly distributed, which is almost never the case. If the data is heavily skewed, you can hit a pathological case in which nearly everything lands in one or two buckets, and then you've made no progress.

You need a method that assumes no prior knowledge of the data. It's also best not to need more than one pass over the data, given how expensive disk reads and writes are.

_________________
- Tycho



PostPosted:Tue Nov 13, 2007 10:45 am 
 

Joined:Thu Aug 18, 2005 8:19 pm
Posts:36
AOL:Fattierob
Location:f(x)
Hm. Well, if you don't want to pass over the data more than once, you read it from disk and push it to a method that does the sorting I described earlier. Perhaps even 26 threads, one per letter? Of course, like you said, Tycho, some threads would be worked much harder than others, and that would be pretty bad.

Interesting, interesting. I can't wait to see what the optimum answer is.


PostPosted:Thu Nov 15, 2007 5:47 pm 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Quote:
Hm. Well, if you don't want to pass over the data more than once, you read it from disk and push it to a method that does the sorting I described earlier. Perhaps even 26 threads, one per letter? Of course, like you said, Tycho, some threads would be worked much harder than others, and that would be pretty bad.

Interesting, interesting. I can't wait to see what the optimum answer is.
With anything more than 3 or 4 threads, you start seeing diminishing returns from multithreading. There's a hardware reason the speedup isn't linear. I could explain, but I feel like I'd bore you.

_________________
- Tycho



PostPosted:Thu Nov 15, 2007 6:29 pm 
 

Joined:Sun Jun 10, 2007 11:41 am
Posts:344
Location:Ninjaville
Fattierob's approach would work well on a multi-CPU server.


PostPosted:Thu Nov 15, 2007 6:40 pm 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
Quote:
Fattierob's approach would work well on a multi-CPU server.
I guess you didn't read what I said. There's a point of diminishing returns to multithreading. I suppose I didn't appropriately state that this especially affects multi-CPU systems, especially when more than one thread is accessing a given memory page.

Just wait for transactional memory before you start considering multithreading seriously.

_________________
- Tycho



PostPosted:Thu Nov 15, 2007 6:56 pm 
 

Joined:Sun Jun 10, 2007 11:41 am
Posts:344
Location:Ninjaville
What about a program with two algorithms: the first sorts the data into subgroups by letter, and the second sorts within the subgroups. I don't know C++ that well, but here is the pseudocode.

Organize the data into subgroups by the first letter of the last name.
Have a parallel thread run qsort (or some other algorithm) on the subgroups the first pass created.

I could be wrong, but I believe this would be faster than running qsort on one giant block of data, since the first pass only has to look at the first letter.

Forgive me if I'm making no sense.


PostPosted:Thu Nov 15, 2007 7:00 pm 
Literally Nine
 

Joined:Sat Apr 02, 2005 3:31 pm
Posts:1171
Location:The vicinity of an area adjacent to a location.
The only way for a multithreaded approach to be quick would be to determine the memory page size and then make certain that your two data blocks from the file sit on separate pages. If so, the task can easily be multithreaded and would theoretically be faster. The most hampering hardware limitation is disk I/O.

_________________
- Tycho


