Tuesday, June 18, 2013
Colorado River - Ruby / Horsethief Canyons - Floating!
Location:
Grand Junction, USA
Again, we got a group together, loaded up the boats, and hit the Colorado River to raft through Ruby and Horsethief canyons. We went for twice as long as our previous trip, and I took some video over two of the float days. Below is the video. If you are looking for more textual information about this trip, see my previous post on the area!
Monday, June 17, 2013
What are switch statements under the hood?
Location:
Salt Lake City, UT, USA
If you don't know what a switch/case block is (often called a switch statement, or switch condition), then this post will likely not be of any use to you. I apologize, and promise that this will be the last very technical post I make for a great while. If you're a programmer and don't know what I am talking about, break out Google and learn about switches, then come back so you know how they work and can properly take advantage of this wonderful tool!
Switches are great! Any time you have several 'if / else if / else' clauses chained together to represent a number of potential paths, you have a great candidate for a switch. At the very least a switch statement will likely give you code that is much clearer and easier to read. Switches are concise and explicit. And when given the opportunity to shine, a switch can dramatically improve the speed of your code.
To see how switches can improve your code you need to understand how they are being performed 'under the hood', and to understand this we have to actually understand a little about the machine code that your source code is compiled into. We do this typically with an assembly language, but I will not assume you are fluent in assembly. Instead I will try to describe what is happening.
Everything in your program is stored in memory before making it to the processor. Instructions (commands), variables, strings, everything. Processors only understand rather simple commands: move this value from memory location A to memory location B, jump to the instruction at a given memory location and continue processing from there, compare the values in A and B, add two values, and other basic operations. These basic operations can be strung together to accomplish some higher-level task. For instance, you could write a basic assembly program to find the absolute difference between A and B:
Address Command
0 Test if A is greater than B
1 If test is true jump to address 4
2 Subtract A from B (answer is in B)
3 Jump to address 5
4 Subtract B from A (answer is in B)
5 Return to caller
When we are finished running this code, the result is stored in B (A and B both represent registers, memory on the processor to store things we are currently operating on, usually 4 to 8 bytes each). We could have accomplished the same thing other ways, for instance, we could have just subtracted them, tested if the answer was less than 0 (negative), and if so we could have multiplied by -1, or even flipped the bits and added one. In programming there are almost always several ways to do the same thing.
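Both approaches from that paragraph can be sketched in Java. This is just an illustration; the class and method names (AbsDiff, absDiffBranch, absDiffBranchless) are my own, and the branch-free version uses the "flip the bits and add one" trick mentioned above:

```java
public class AbsDiff {
    // Branching version: mirrors the pseudo-assembly above.
    static int absDiffBranch(int a, int b) {
        if (a > b) {
            return a - b;   // "Subtract B from A"
        }
        return b - a;       // "Subtract A from B"
    }

    // Branch-free version: subtract, then negate via two's complement
    // (flip the bits and add one) when the result is negative.
    static int absDiffBranchless(int a, int b) {
        int d = a - b;
        int mask = d >> 31;        // all 1s if d is negative, else all 0s
        return (d ^ mask) - mask;  // flips bits and adds 1 only when negative
    }

    public static void main(String[] args) {
        System.out.println(absDiffBranch(3, 10));     // 7
        System.out.println(absDiffBranchless(3, 10)); // 7
    }
}
```

Either way the answer is the same; as the post says, there are almost always several ways to do the same thing.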
Now that we understand a little bit about how assembly works we can talk about something called a jump table. A jump table is not something explicitly offered by assembly. Instead it is a logical programming construction, something someone made up as a way to use assembly to create something useful. To see how they work, it helps to lay out a scenario in which we would use them. For our example we will say that we have a game, and the player should be able to choose between 6 different classes. In C we might make some constants for each class to map the class name to a number which will be used internally to determine the class, in Java we may use an enum to do the same thing. Regardless of how we represent it, we will abstract our classes to numeric codes 0 through 5. Here's how this may look as an if/else if block:
if (chosenClass == WARRIOR)
hitPoints = 50;
else if (chosenClass == MAGE)
hitPoints = 15;
else if (chosenClass == WARLOCK)
hitPoints = 20;
else if (chosenClass == ROGUE)
hitPoints = 30;
else if (chosenClass == NINJA)
hitPoints = 40;
else if (chosenClass == BARD)
hitPoints = 10;
Note: ALL CAPS represent enums or constants, numbers, 0-5.
A switch form of the above code might look like this:
switch (chosenClass) {
case WARRIOR: hitPoints = 50; break;
case MAGE: hitPoints = 15; break;
case WARLOCK: hitPoints = 20; break;
case ROGUE: hitPoints = 30; break;
case NINJA: hitPoints = 40; break;
case BARD: hitPoints = 10; break;
}
So I hope that by looking at my previous example of pseudo assembly, you can see the obvious way this may be implemented in machine code: a few comparisons for equality and jumps to different addresses and you're done. But since I am writing about it, there must be a better way. Let me propose that instead of doing comparisons, we do some calculating. We have 6 total choices over the range 0 to 5; this is important. Note that there are no gaps, though this is not a necessity. Also recognize that once we have made a choice, we have our 6 possible outcomes, and each of these in machine code is two operations: one to set a memory location or register value, and another to jump past our choices (the equivalent of break), twelve total lines of code. We can now write something with this logic (chosenClass is held in A, hitPoints is held in B):
Address Command
00 Multiply A by 2 and store the result in A
01 Jump to address (A + 2)
02 Set B to 50
03 Jump to address 14
04 Set B to 15
05 Jump to address 14
06 Set B to 20
07 Jump to address 14
08 Set B to 30
09 Jump to address 14
10 Set B to 40
11 Jump to address 14
12 Set B to 10
13 Jump to address 14
Our code has a really cool feature to it. Where the logic in our C/Java style code using if/else blocks may have to test our condition up to six times, our assembly code is able to identify the correct code with two basic math operations and a direct jump to a new memory location. No matter which choice is made, we will never execute more than four instructions. In computer science we refer to this as constant complexity, and it is really the holy grail of algorithms. Any time you see complexity go constant like this, you know you are looking at something special.
So how is this working? First we look at how large the blocks we are jumping to are: in this case each is 2 instructions, one to set B to a value, and one to jump beyond the switch (to instruction address 14). To account for this we multiply A by 2. We then look at where these blocks start. They start at address 02, so we add 2 to A (since our class codes start at 0, class 0 lands exactly on address 02), and this gives us the address to jump to. If the switch input is 3, we multiply by 2, add 2, and find that we should jump to address 08, where we set B to 30, which correctly corresponds to ROGUE, class number 3.
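The address arithmetic is easy to check with a tiny sketch (the class and method names here are mine, purely for illustration):

```java
public class JumpTarget {
    // target address = first block's address + choice * instructions per block
    static int jumpTarget(int choice) {
        int blockSize = 2; // each case is two instructions: set B, then jump
        int base = 2;      // address of the first "Set B" instruction
        return choice * blockSize + base;
    }

    public static void main(String[] args) {
        System.out.println(jumpTarget(0)); // 2, the "Set B to 50" (WARRIOR) block
        System.out.println(jumpTarget(3)); // 8, the "Set B to 30" (ROGUE) block
    }
}
```

No comparisons anywhere: the input is turned directly into an address.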
But this is only a simplified version of a jump table, contrived to show the basic logic and illustrate how jump tables avoid complexity and, as a result, can substantially speed up your code. Our logic here is great for illustrating those points, but it is terribly limited in application. What if the code we execute on a matched condition is not uniform in length? Suddenly our trick of multiplying by 2 fails! What if the numbers don't start at 0 or 1? What if they start at 1430 and go to 1634, a span of 204? Our code again breaks. What if there are gaps in the numbers used? Using my example code, this quite real possibility is completely undefined -- entirely unacceptable for computers. Being undefined simply means that we have to define some behavior which will consistently generate correct results. For our needs this is actually quite easy, but to reveal our strategy we first have to resolve the first issue: that the code we want to run when we match is not of uniform length. Ranges that do not start at 0 or 1 are easy to handle by converting them to our range. 1430 to 1634 is the same as 0 to 204, so a little basic subtraction converts the range, and again, all of this is done in constant complexity!
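That range conversion is nothing more than one subtraction, as a quick sketch shows (the 1430 to 1634 range is the hypothetical one from the paragraph above, and the names are mine):

```java
public class RangeShift {
    // Map a case value in [1430, 1634] onto a zero-based table index
    // by subtracting the smallest case label.
    static int tableIndex(int value) {
        int lowest = 1430; // smallest case label in the switch
        return value - lowest;
    }

    public static void main(String[] args) {
        System.out.println(tableIndex(1430)); // 0
        System.out.println(tableIndex(1634)); // 204
    }
}
```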
We can solve the issue of non-uniform lengths by using an intermediary list which records how these different lengths affect the address at which each block starts. As with everything on a computer, our list goes into memory, with each choice stored as a memory address (4-8 bytes), more specifically, the memory address of one of the instruction blocks. We can then calculate a position into this list and retrieve the memory address to jump to for the associated instructions. Since every entry is a memory address, the offsets in our list are multiples of 4 or 8, and we can now use our simple logic from above to jump to the correct locations. Let me illustrate this with code, using 4 as the multiplier to simulate a 32-bit system:
Address Command
00 Multiply A by 4 and store the result in A
01 Retrieve the value stored in our LIST at offset A, store it in A
02 Jump to address stored in A
03 Set B to 50
04 Set C to 12
05 Jump to address 20
06 Set B to 15
07 Jump to address 20
08 Set B to 20
09 Set C to 18
10 Jump to address 20
11 Set B to 30
12 Jump to address 20
13 Set B to 40
14 Set C to 100
15 Jump to address 20
16 Set B to 10
17 Jump to address 20
18 NO OPERATION (processor just does nothing for this instruction)
19 Jump to address 20
Our LIST looks like this:
Offset Value
00 03
04 06
08 08
12 11
16 13
20 16
24 18
So again, let's say we want to pick class 3. We multiply that by 4 and get 12. We look up the address stored at offset 12 in the LIST and find address 11. We jump to address 11, where we set B to 30 and jump past to the end. Again, this corresponds to ROGUE, class number 3. Our algorithm is correct, and we have greatly improved the usability of our logic at the cost of a little extra memory. We retain our constant complexity, which is key. No matter how large a list of conditions we are working with, we will never have to run more than three instructions to reach the right block, and never more than 5 or 6 to be finished!
Our last task is defining an action when the values have gaps. This also allows us to handle a default case if we desire. The first thing to notice about the code above is that we have more items in our list than we have possible choices. We have added an address at offset 24, which would correspond to a 7th element. This is a default case, and it corresponds to instruction address 18, where we just do nothing and then jump to the end (our code has no default case). Handling gaps is simple: we just add an entry in our LIST for each missing number, and set its value to the instruction address just past our code, in this case address 20. That way if we reach a number with no corresponding instructions, we simply jump right past our code and continue processing. Again, all of this maintains constant complexity, our ultimate goal!
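Java has no computed jumps, but the spirit of the LIST can be sketched with an array of handlers, where the extra last entry plays the role of the default at offset 24. All the names here (JumpTableDemo, hitPointsFor, TABLE) are mine, not anything a compiler actually emits:

```java
public class JumpTableDemo {
    static int hitPoints;

    // One handler per class code, indexed just like the LIST above.
    // The extra last entry is the default case: it does nothing.
    static final Runnable[] TABLE = {
        () -> hitPoints = 50, // 0 WARRIOR
        () -> hitPoints = 15, // 1 MAGE
        () -> hitPoints = 20, // 2 WARLOCK
        () -> hitPoints = 30, // 3 ROGUE
        () -> hitPoints = 40, // 4 NINJA
        () -> hitPoints = 10, // 5 BARD
        () -> { }             // default: out-of-range values land here
    };

    static int hitPointsFor(int chosenClass) {
        // Anything outside 0-5 is redirected to the default entry.
        int index = (chosenClass >= 0 && chosenClass < 6) ? chosenClass : 6;
        TABLE[index].run(); // one indexed lookup, one indirect "jump"
        return hitPoints;
    }

    public static void main(String[] args) {
        System.out.println(hitPointsFor(3)); // 30 (ROGUE)
    }
}
```

However many cases we add, choosing a handler stays one bounds check, one lookup, and one call: constant complexity.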
An astute observer may notice a flaw in this logic. What if our range is huge, with a lot of gaps? What if we have conditions to match 0, 4, 10, 50, 945328, and 1 << 63? We have 6 possible values, but a range that goes from 0 to some obscenely large value (2^63). Not only do the gaps occupy substantially more space than our valid options, but our list must have 2^63 entries! You could never hope to hold all of this in the few gigabytes (or even terabytes) of memory available to us. Luckily modern compilers are smart enough to recognize the conditions which will lead to excessive wasted space like this, and will only compile code into jump tables when they can determine that the range and gap density is low enough to make a jump table efficient. In other words, not all conditional chains or switches will result in a jump table. Only those which lend themselves to increased efficiency, at a cost in memory that is not disproportionately large, will be converted into jump tables.
I think that's about it, but if you think I left something out, notice any mistake, have questions, or anything at all, leave a comment!
Tuesday, June 11, 2013
Keeping Fish and the Aquatic Nitrogen Cycle!
Location:
Salt Lake City, UT, USA
I don't know about the rest of you, but I love fish! I have been an avid enthusiast and hobbyist since I was young. My father and I were both introduced to the hobby at the same time, on a whim stopping in a pet store where the manager was a Cichlid fanatic.
One of the big barriers to keeping fish is knowing how to keep them alive. Unfortunately there is a lot of misinformation about keeping fish out there. Really, to keep fish, you only need two things besides the obvious of water and something to hold the water. You need to regulate the temperature (usually a heater is quite sufficient), and you need to filter the water. I've already said about all there is to say about regulating temperature. The only thing to add is that some tanks, usually with very powerful lights, will require chilling instead of heating.
There are several types of filtration: mechanical, chemical, and biological. Mechanical filtration is what most people think of when they think of filters. This involves physically trapping non-dissolved matter, usually in some sort of mesh. Though this does a great job of removing larger visible impurities, the primary benefit to mechanical filtration is that it can help keep your water more clear, which makes little difference to the fish but can increase your enjoyment in watching them significantly. Mechanical filters range anywhere from a foam pad, to a micron filter further refined by diatomaceous earth (the latter of which will even filter out mid-water algal and bacterial blooms), to a method called foam fractionation which is used in marine aquariums.
Chemical filtration uses substances that absorb impurities from the water. The most common substance used is activated carbon, though other substances might be used depending on what chemical absorption is required. This can appear like biological filtration in some regards: it can remove some of the same substances that biological filters take care of, but it accomplishes this by absorbing the impurities. What most people don't understand about chemical filtration is that it is a band-aid for a real issue. There is no way to know how effective it is being, and once it fills up, it starts to leach chemicals back into the water. Chemical filtration, in my opinion, should only be used for very short periods of time between changing (8-24 hours) in emergency situations while you are working to fix the underlying cause of the issue.
So that leaves us with biological filtration, the core of keeping fish, and the core of biological filtration is something called a nitrogen cycle. Understanding this is really the key foundation of keeping anything alive that lives in water. It's all you really need to know to keep freshwater fish alive, and the first stepping stone toward keeping marine fish (salt water) and invertebrates, like corals and anemones. There are a lot of things that go into water, but true waste is primarily nitrogen, in the form of ammonia. Even uneaten food and dead organisms rotting are ultimately producing ammonia.
Ammonia is a poison to fish, and a high enough concentration will lead to all sorts of problems resulting in a dead tank. We MUST filter this out! We look to nature and find that there are aquatic bacteria which actually eat ammonia! Nitrosomonas basically eat ammonia and produce another form of nitrogen called nitrite(s). This form of nitrogen is not as toxic as ammonia, but it is still a toxin in a fish environment and must be removed. Again we look to nature and find another little aquatic bacterium called Nitrobacter which eats nitrites. Like its cousin, Nitrobacter produces nitrate as a byproduct. Nitrate is a substance you might have heard of before. It is a form of nitrogen commonly used in fertilizers, and is relatively non-toxic. This whole process of converting ammonia to nitrate is the nitrogen cycle, and basically all life depends on it.
Understanding this transfer from ammonia to nitrite, then nitrite to nitrate, is the foundation for understanding how nature filters water. But how do we apply this to our tanks? We look at how these bacteria live in water, and we find that they flourish best when they have a lot of surface area to colonize (rocks and sand perform this in nature) and a lot of oxygen, which in aquatics means swift moving water with a lot of surface agitation and a large area for exchange with the air. So in your aquarium you need good water to air surface area (go wide or long, not tall), have good water movement (most fish really want a lot more than most are given), and agitate the surface a little. It is not the air bubbles rising that help get air in water, it is their breaking at the surface that helps promote gas exchange.
Provide a lot of surface for bacteria to live on. Under-gravel filters of days gone by used the gravel as an actual medium for bacteria to grow, but those filters tend to trap waste under the filter plates where it never gets removed, and the flow is not strong enough to provide oxygen. This results in an anaerobic (non-oxygen breathing) bacteria colony, which you do not want. Instead, fill the tank with porous rock with a lot of surface area, and get good powerhead-style pumps to keep things moving. Lace rock and some lava rocks work well; I suggest you consult with your local fish store to pick out the right rocks that won't leach bad stuff into your water. There are also a lot of filters that provide media for bacteria growth, from Marineland's power wheels, to bio-balls, and canister inserts, but I prefer to stick with the more time-tested approach of having the system support itself (I do advocate sumps, separate tanks hooked together, to help increase water volume and move some of the bacteria-supporting rocks out of the main display tank).
Armed with well oxygenated water, good strong water flow, and plenty of rocks for bacteria to colonize, your tank will process its own waste into harmless nitrate that can be removed by aquatic plants, or with regular water changes. Failure to remove nitrate can lead to algae growth.
A few things to keep in mind. Bacteria take some time to grow, and will not grow until there is an excess of food (ammonia and nitrite). When first starting out it is important to understand that you will have to cultivate this bacteria up slowly as the tank matures. This initial 'cycle' typically takes 2-3 months (but can vary from 1 month to a year; it really depends on a lot of factors, so assume 2-3 months). To get ammonia in the tank you need something alive. Just one fish per 20-40 or so gallons will work well, assuming they are fairly small, fewer if they are larger fish, and you should already have a vision for what you want in the tank, so you should know which fish to get (the least aggressive variety is best to start with). Get a test kit for ammonia and check it every 3-7 days. You should see the ammonia go up, and plateau for a week or two.
You should be like a month in already and be frustrated at how boring your tank is so far. That is OK! Just sit on that emotion and in the end you will come out with a work of art and a piece of pure beauty that everyone ooh's and ahh's over. Get a nitrite test kit and you should start to see that the nitrite is going up slowly. Keep an eye on the nitrite, and watch how after a week or two the ammonia has dropped down to fairly safe levels as the nitrite rises. You could add a few more fish at this point, but only a few, go really slow. The nitrite will go through the same pattern the ammonia did: it will plateau, then start to drop as the new bacteria colony grows to consume the excess resources. You will want a nitrate test as well, and again, you can watch nitrate rise as nitrite falls to safe levels. Once this is done you have 'cycled' the tank, and have a full nitrogen cycle going on! Pretty cool huh?
Again, go slow from here. Each time you add more fish the ammonia is going to spike again; more bacteria will grow, ammonia will drop as nitrite grows, then nitrite drops as nitrate grows. Too much of a swing and your fish will suffer, and likely get sick. So take it slow, give yourself a good year, maybe even two, to fully stock the tank, and you will end up with something you love instead of an algae-infested death trap that you loathe to look at.
If you have any questions about fish or fishkeeping, ASK!

Monday, June 3, 2013
Java - Pass by Reference Value!
Location:
Salt Lake City, UT, USA
To maintain a little bit of the geeky theme I have had going this last week, I wanted to talk a little about something I think is often misunderstood by those beginning in programming, and can be especially confusing to those moving to Java from other languages: the subject of reference vs value.
The primary place where I see this confused is in method parameter passing, but a better understanding of this can pay off in other areas as well. Also, one point of note: the terms function and method are interchangeable in computer science. Function is the traditional term, borrowed from mathematics and used in languages like C; method is simply an alternate term, and is favored in Java.
Before we start to discuss the differences between them, we should ensure we understand the terms:
Value is what we would call an immediate. It is a section in memory where an actual value sits. It could be any data type, an int, double, char[], or even a struct. On the inside this is just a set amount of memory, contiguous bytes, the number of which depends on the size (or width) of the datatype being stored there.
Reference is simply a memory address which points to a 'value'. References are held in variables, in C/C++ they are called pointers, which is an ideal name for them, as they literally point to somewhere else in memory. I will continue to call them pointers here. The size of a pointer is simply the size (or again, width) of the memory addresses on your hardware (either 4 or 8 bytes typically), and will never be any different because it simply stores a memory address that points to some value.
Values are fairly straightforward to understand, but references are more difficult, mainly because references are also values, but special values, memory values. This means that references can point to other references, forming a chain which hopefully ends at a real object that has been allocated in memory. A reference which does not point to a valid allocated object in memory is considered a 'null pointer', it should not be used other than to write a valid address to.
The most succinct way I can think of these relationships is that a value is an actual value that is stored in memory. A reference is the address to a memory location where a value is stored. And a pointer is a special type of value, it only holds references (memory addresses), it's the manifestation of a reference.
One of the useful reasons for having references is to facilitate parameter passing, and that is primarily what I intend to address today.
Say you have a exponent or power function, it takes int A and B and returns a long, A to the power of B (A^B). On our imaginary system an int is 32 bits and a long is 64, so we have a total of 64 bits of input to our function, or 8 bytes of data, in computer speak this is basically nothing. To run this function we allocate 8 bytes, copy over the values that the method caller provided, and crunch the numbers. When it's time to return our long (which we had to allocate within our function) we again have a small amount of data and we simply copy it over to a location where the calling function knows to find it. This method of passing parameters to a function is called pass by value. We are literally passing the value of the variable provided to the function (or more precisely, an identical copy of that value).
Now say you have a sort function, it takes a linked list, and has no return value (void return type), it sorts the list provided in-place. When sort function is called what happens? Given the scenario in the previous paragraph we would assume that enough memory would be allocated to hold the linked list, the values held in the original list would be copied over to this new list, and we would sort the values within this new list. But this has a few issues. For one, we couldn't have a return type of void because without returning, we have no way of getting our copy of the list back to the caller. But even if we return our new list, what if our list is big? I mean really big? What if our list was more than half our available memory and we couldn't copy the whole thing to pass to the function?
Do we really require twice as much space as our list requires to call a sort method? Of course not! This is where references come in. Instead of copying our list over to our function, we can copy the address of our list which is what we mean by a reference, and then the method can access it without having to copy or move it at all. Remember, a reference is typically 32 to 64 bits, basically nothing compared to even a small linked list. This is referred to as passing by reference since we don't copy the actual value we want to operate on, we pass its memory address so that the function can work directly on the object.
So you can probably see why we need pass by reference, but why then do we bother with pass by value? The main reason that I am aware of is that references do not allow a caller to protect their data from a function. When a function is passed a reference there is a risk that the function will modify the input data which the calling function is still using, and this could cause the calling function to violate its assumptions about the state of its own data. When you pass by value you guarantee that none of your parameters are modified without your knowledge, as the called function is only operating on copies. There is also a question of optimization. If you have a function that takes a byte as input, passing by reference might use four times as much memory to pass parameters as passing by value, on a 64 bit system it might be 8 times. Just as we might want to optimize functions that handle very large inputs, we also should be able to optimize functions with small inputs.
Java does away with the question of reference vs value by making all primitive data types (char, short, int, long, float, double, etc...) pass by value, and all non-primitives pass by reference-value. Java has no pointers because every reference to a non-primitive type is always a pointer. There is no special syntax involved because this is simply how all non-primitive variables work, they are all pointers to the object. So what is passed to a method in java is a copy of that reference, or in other words a totally new variable (which under the hood is a pointer), with a copy of the address to the variable passed to the function. This distinction can be a little tricky to sort out at first, and difficult to describe in text, so you might want to read that a few times...
Some of the more insidious 'bugs' I have seen in beginning students Java code is related to not understanding how method parameters are passed under the hood. How modifying the contents of an object itself will be reflected in the calling methods copy of the input, but something like reassigning the variable to a new object entirely will result in no change to the object passed in by the caller. In other words there is a big difference between:
and
The first example above will result in the calling function having two lists that contain the same nodes in the same order, as we have simply performed operations on the original lists provided by the caller. In the second example we have assigned the address for object b to the variable named a. the variable a now points to the same object that the variable b points to. In fact, within the second method we no longer even have a reference to the list originally passed in as parameter 'a' at all! Our second method has not changed the original inputs as expected, it simply changed how the method references those lists, duplicating the reference to the list passed in as 'b' and losing the reference to the list passed in as 'a'.
This is a subtle but important distinction to understand if you hope to write functional Java code. If you are struggling, leave a comment and I might be able to help make it more clear!
I had intended to write a little about the stack and the heap (dynamic vs static memory) and why this distinction of value vs reference exists in the first place, but this post has already grown larger than I would like, so I will save that discussion for another day!
The primary place where I see this confused is in method parameter passing, but a better understanding can pay off in other areas as well. One note on vocabulary: the terms function and method are interchangeable in computer science. Function is the traditional term, borrowed from mathematics and used in languages like C; method is simply an alternate term, favored in Java.
Before we start to discuss the differences between them, we should ensure we understand the terms:
Value is what we would call an immediate. It is a section in memory where an actual value sits. It could be any data type, an int, double, char[], or even a struct. On the inside this is just a set amount of memory, contiguous bytes, the number of which depends on the size (or width) of the datatype being stored there.
Reference is simply a memory address which points to a 'value'. References are held in variables, in C/C++ they are called pointers, which is an ideal name for them, as they literally point to somewhere else in memory. I will continue to call them pointers here. The size of a pointer is simply the size (or again, width) of the memory addresses on your hardware (either 4 or 8 bytes typically), and will never be any different because it simply stores a memory address that points to some value.
Values are fairly straightforward to understand, but references are trickier, mainly because references are themselves values, just special ones: memory addresses. This means that a reference can point to another reference, forming a chain which hopefully ends at a real object that has been allocated in memory. A reference which does not point to a valid allocated object (the degenerate case being the 'null pointer') must not be dereferenced; its only safe use is as a place to store a valid address.
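In Java this failure mode surfaces as the infamous NullPointerException. A minimal sketch (the class and method names here are mine, purely for illustration):

```java
public class NullDemo {
    // Returns the length of s, or -1 if s is a null reference.
    public static int safeLength(String s) {
        if (s == null)
            return -1;       // the reference holds no address; do not dereference
        return s.length();   // safe: s points to a real String object
    }

    public static void main(String[] args) {
        System.out.println(safeLength("hello")); // 5
        System.out.println(safeLength(null));    // -1
        try {
            String t = null;
            t.length();                          // dereferencing null throws
        } catch (NullPointerException e) {
            System.out.println("caught NPE");
        }
    }
}
```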
The most succinct way I can put these relationships: a value is actual data stored in memory. A reference is the address of the memory location where a value is stored. And a pointer is a special type of value which holds only references (memory addresses); it is the concrete manifestation of a reference.
One of the useful reasons for having references is to facilitate parameter passing, and that is primarily what I intend to address today.
Say you have an exponent or power function: it takes ints A and B and returns a long, A to the power of B (A^B). On our imaginary system an int is 32 bits and a long is 64, so we have a total of 64 bits of input to our function, or 8 bytes of data; in computer speak this is basically nothing. To run this function we allocate 8 bytes, copy over the values that the method caller provided, and crunch the numbers. When it's time to return our long (which we had to allocate within our function) we again have a small amount of data and we simply copy it over to a location where the calling function knows to find it. This method of passing parameters to a function is called pass by value. We are literally passing the value of the variable provided to the function (or more precisely, an identical copy of that value).
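Sketched in Java (my own toy implementation, not from any library), pass by value means the method can scribble on its copies of the parameters without the caller ever noticing:

```java
public class PassByValue {
    // Computes a^b by repeated multiplication. a and b here are copies
    // of the caller's values, so modifying them is invisible to the caller.
    public static long pow(int a, int b) {
        long result = 1;
        while (b > 0) {
            result *= a;
            b--;            // mutates only our local copy of b
        }
        return result;
    }

    public static void main(String[] args) {
        int base = 2, exp = 10;
        long r = pow(base, exp);
        System.out.println(r);                // 1024
        System.out.println(base + " " + exp); // still "2 10": only copies changed
    }
}
```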
Now say you have a sort function: it takes a linked list, has no return value (void return type), and sorts the list provided in-place. When the sort function is called, what happens? Given the scenario in the previous paragraph we would assume that enough memory would be allocated to hold the linked list, the values held in the original list would be copied over to this new list, and we would sort the values within this new list. But this has a few issues. For one, we couldn't have a return type of void, because without returning we have no way of getting our copy of the list back to the caller. But even if we return our new list, what if our list is big? I mean really big? What if our list was more than half our available memory and we couldn't copy the whole thing to pass to the function?
Do we really require twice as much space as our list requires to call a sort method? Of course not! This is where references come in. Instead of copying our list over to our function, we can copy the address of our list which is what we mean by a reference, and then the method can access it without having to copy or move it at all. Remember, a reference is typically 32 to 64 bits, basically nothing compared to even a small linked list. This is referred to as passing by reference since we don't copy the actual value we want to operate on, we pass its memory address so that the function can work directly on the object.
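Java's Collections.sort is a handy demonstration of this idea: it returns void, yet the caller sees the sorted result, because the method works through a reference to the very same list. A small sketch (the wrapper method is mine):

```java
import java.util.Collections;
import java.util.LinkedList;

public class InPlaceSort {
    // Sorts the caller's list through the reference we were handed;
    // nothing is copied except the reference itself.
    public static void sortInPlace(LinkedList<Integer> list) {
        Collections.sort(list);
    }

    public static void main(String[] args) {
        LinkedList<Integer> nums = new LinkedList<Integer>();
        Collections.addAll(nums, 3, 1, 2);
        sortInPlace(nums);        // void return, yet...
        System.out.println(nums); // ...the caller sees [1, 2, 3]
    }
}
```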
So you can probably see why we need pass by reference, but why then do we bother with pass by value? The main reason that I am aware of is that references do not allow a caller to protect their data from a function. When a function is passed a reference there is a risk that the function will modify the input data which the calling function is still using, and this could cause the calling function to violate its assumptions about the state of its own data. When you pass by value you guarantee that none of your parameters are modified without your knowledge, as the called function is only operating on copies. There is also a question of optimization. If you have a function that takes a byte as input, passing by reference might use four times as much memory to pass parameters as passing by value, on a 64 bit system it might be 8 times. Just as we might want to optimize functions that handle very large inputs, we also should be able to optimize functions with small inputs.
Java does away with the question of reference vs value by making all primitive data types (char, short, int, long, float, double, etc.) pass by value, and all non-primitives pass by value of the reference. Java has no explicit pointer syntax because every non-primitive variable already is, under the hood, a pointer to its object. So what is passed to a method in Java is a copy of that reference: a totally new variable (which under the hood is a pointer) holding a copy of the address stored in the variable the caller passed in. This distinction can be a little tricky to sort out at first, and difficult to describe in text, so you might want to read that a few times...
Some of the more insidious 'bugs' I have seen in beginning students' Java code are related to not understanding how method parameters are passed under the hood: how modifying the contents of an object will be reflected in the calling method's copy of the input, but reassigning the parameter variable to a new object entirely will leave the caller's object untouched. In other words, there is a big difference between:
public static <T> void copyList(LinkedList<T> a, LinkedList<T> b) {
a.clear();
a.addAll(b);
}
and
public static <T> void copyList(LinkedList<T> a, LinkedList<T> b) {
a = b;
}
The first example above will result in the calling function having two lists that contain the same elements in the same order, since we have simply performed operations on the original lists provided by the caller. In the second example we have assigned the address of object b to the variable named a. The variable a now points to the same object that the variable b points to. In fact, within the second method we no longer even hold a reference to the list originally passed in as parameter 'a' at all! The second method has not changed the caller's inputs; it has only changed how the method itself references those lists, duplicating the reference to the list passed in as 'b' and losing the reference to the list passed in as 'a'.
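To watch this from the caller's side, here is a self-contained sketch; I have renamed the two versions (copyIntoA and reassignA are my names) since they could not share one signature in a single class:

```java
import java.util.Arrays;
import java.util.LinkedList;

public class CopySemantics {
    // Mutates the object 'a' points to: the caller sees the change.
    public static void copyIntoA(LinkedList<Integer> a, LinkedList<Integer> b) {
        a.clear();
        a.addAll(b);
    }

    // Reassigns our local copy of the reference: the caller sees nothing.
    public static void reassignA(LinkedList<Integer> a, LinkedList<Integer> b) {
        a = b;
    }

    public static void main(String[] args) {
        LinkedList<Integer> a = new LinkedList<Integer>(Arrays.asList(9, 9));
        LinkedList<Integer> b = new LinkedList<Integer>(Arrays.asList(1, 2, 3));

        reassignA(a, b);
        System.out.println(a); // still [9, 9]: only the method's pointer changed

        copyIntoA(a, b);
        System.out.println(a); // [1, 2, 3]: we modified the object itself
    }
}
```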
This is a subtle but important distinction to understand if you hope to write functional Java code. If you are struggling, leave a comment and I might be able to help make it more clear!
I had intended to write a little about the stack and the heap (dynamic vs static memory) and why this distinction of value vs reference exists in the first place, but this post has already grown larger than I would like, so I will save that discussion for another day!
Sunday, June 2, 2013
Java file organization - part 2
Location:
Salt Lake City, UT, USA
So last post I showed how I built a few quick file utilities in Java. If you will recall, I have a very large music collection I copy to (currently) 8 USB thumb drives for use in my car. I do not want bands broken up between disks, and I want to avoid copying files that my car stereo does not read. So I wrote a few Java utilities to help me automate this task as my music collection grows.
This post I am going to show two more utilities, and then I'll show how I use these methods to build my disks. We will need a method to copy files and a method to delete files (to clean out the old files when I rebuild the disks). I'll start with the remove method because it is the smallest.
private static void removeAllChildren(File root) {
LinkedList<File> toRemove = Tools.getAllChildren(root, false, true,
false, false), toCheck = Tools.getImmediateChildren(root, true,
false, false, false);
Iterator<File> i = toRemove.iterator();
while (i.hasNext())
i.next().delete();
toRemove.clear();
while (!toCheck.isEmpty()) {
File file = toCheck.pop();
LinkedList<File> temp = Tools.getImmediateChildren(file, true,
false, false, false);
toCheck.addAll(temp);
toRemove.addAll(temp);
toRemove.add(file);
}
Collections.reverse(toRemove);
i = toRemove.iterator();
while (i.hasNext())
i.next().delete();
}
Basically we give it a folder, and it will remove all children. Our general strategy is to first collect all files and simply remove them. Next we iterate through toCheck and collect all the remaining directories and store them for removal as we continue to check them. We also add the current directory to the list to remove. This is similar to the logic we used in the last post to build our folder listing. Finally, we reverse toRemove so that we are removing the deepest nested directories first, and we iterate again to remove them all. This should all be rather straightforward.
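For comparison only, the same "files first, then directories deepest-first" deletion can be written with Java 7's Files.walkFileTree, which visits a directory's contents before calling postVisitDirectory on the directory itself. This is a sketch of an alternative, not the post's code (the demo method and temp-directory names are mine):

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class TreeDelete {
    // Deletes everything under root, and root itself. postVisitDirectory
    // runs only after a directory's children are gone, so ordering is safe.
    public static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                Files.delete(dir); // children have already been removed
                return FileVisitResult.CONTINUE;
            }
        });
    }

    // Builds a small temp tree, deletes it, and reports success.
    public static boolean demo() {
        try {
            Path root = Files.createTempDirectory("delete-demo");
            Files.createDirectories(root.resolve("a").resolve("b"));
            Files.write(root.resolve("a").resolve("b").resolve("f.txt"),
                    new byte[] { 1 });
            deleteTree(root);
            return !Files.exists(root);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```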
The next method is a bit more complicated, but not too bad. We're using NIO channels to perform the actual copying of the contents.
private static long copyFile(File fromFile, short diskNum, File newPath,
File oldPath) {
String disk = Short.toString(diskNum);
if (diskNum < 10)
disk = "0" + disk;
String path = fromFile.getAbsolutePath().replace(
oldPath.getAbsolutePath(),
newPath.getAbsolutePath() + File.separator + disk);
File toFile = new File(path.replace(fromFile.getName(), ""));
if (!toFile.exists())
toFile.mkdirs();
toFile = new File(path);
try {
toFile.createNewFile();
} catch (IOException e1) {
// Choose an action here!
return 0;
}
FileInputStream source = null;
FileOutputStream destination = null;
long size = 0;
try {
source = new FileInputStream(fromFile);
destination = new FileOutputStream(toFile);
FileChannel sourceFileChannel = source.getChannel();
FileChannel destinationFileChannel = destination.getChannel();
size = sourceFileChannel.size();
sourceFileChannel.transferTo(0, size, destinationFileChannel);
} catch (FileNotFoundException e) {
// Choose an action here!
return 0;
} catch (IOException e) {
// Choose an action here!
return 0;
} finally {
// either stream may be null if its constructor threw
try {
if (source != null)
source.close();
} catch (IOException e) {
}
try {
if (destination != null)
destination.close();
} catch (IOException e) {
}
}
return size;
}
This method will return a long representing the size of the copied file (0 if the copy fails). The input is the file which is being moved, a disk number (used to name the target directory), a new root path, and the old root path. These paths are used to determine where to copy the file to. For instance, I have my music stored as /usr/home/me/music/<artist>/<album>/, so the oldPath would be /usr/home/me/music/. I want to store these disks to /mnt/<disknum>, so the newPath would be /mnt/. Everything else about the path is preserved, so if our old file was /usr/home/me/music/coolband/bestalbum/bestsong.mp3, it would be copied to /mnt/01/coolband/bestalbum/bestsong.mp3 (assuming it is destined for the first disk).
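The path translation described here can be isolated into a tiny helper to make the behavior concrete. This is my own sketch (the method name remap is mine); the post inlines the same logic in copyFile:

```java
import java.io.File;

public class PathMap {
    // Maps a file under oldRoot to the same relative location under
    // newRoot/<two-digit disk number>, mirroring the copyFile logic.
    public static String remap(String filePath, String oldRoot, String newRoot,
            int diskNum) {
        String disk = diskNum < 10 ? "0" + diskNum : Integer.toString(diskNum);
        return filePath.replace(oldRoot, newRoot + File.separator + disk);
    }

    public static void main(String[] args) {
        // On a Unix-style system this prints
        // /mnt/01/coolband/bestalbum/bestsong.mp3
        System.out.println(remap(
                "/usr/home/me/music/coolband/bestalbum/bestsong.mp3",
                "/usr/home/me/music", "/mnt", 1));
    }
}
```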
The meat of this method starts out by building the name for the new path, creating any necessary directories which contain this file, and creating the actual file. From there we connect IO streams to the source and destination files, and retrieve their channels for reading and writing. Once we have the size stored, the actual instruction to transfer the file is basic: we call the transferTo method on the source channel, giving it the start position, the number of bytes to transfer, and the destination channel. The rest of the method is exception handling in case something goes wrong, and we finish by returning the size of the transfer. It is interesting to note that the file copy itself is only one line, and even with setting up the IO, it's still less than 10. Half if not more of the coding work involved is admin work like generating path names and creating parent directories.
Armed with all these tools we are ready for the main function!
private static long maxSize = 64000500000L;
private static short maxDir = 998;
private static int maxFiles = 20480;
private static File rootDir = new File("/path/to/masters/");
private static File newLocation = new File("/path/to/copy/to/");
public static void main(String[] args) {
long currSize = 0;
int totalFiles = 0;
short totalDirs = 0;
LinkedList<LinkedList<File>> diskList = new LinkedList<LinkedList<File>>();
LinkedList<File> disk = new LinkedList<File>();
diskList.add(disk);
LinkedList<File> headList = Tools.getImmediateChildren(rootDir, true,
false, false, true);
Collections.sort(headList);
Iterator<File> i = headList.iterator();
while (i.hasNext()) {
File head = i.next();
LinkedList<File> files = Tools.getAllChildren(head, false, true,
true, true);
LinkedList<File> dirs = Tools.getAllChildren(head, true, false,
true, true);
short numDirs = (short) dirs.size();
int numFiles = files.size();
long size = 0;
for (File file : files)
size += file.length();
if (currSize + size > maxSize || totalFiles + numFiles > maxFiles
|| totalDirs + numDirs > maxDir) {
disk = new LinkedList<File>();
diskList.add(disk);
currSize = 0;
totalFiles = 0;
totalDirs = 0;
}
disk.addAll(files);
currSize += size;
totalFiles += numFiles;
totalDirs += numDirs;
}
removeAllChildren(newLocation);
for (short diskNum = 1; !diskList.isEmpty(); diskNum++) {
i = diskList.pop().iterator();
while (i.hasNext())
copyFile(i.next(), diskNum, newLocation, rootDir);
}
}
We start out by defining some limitations imposed by the hardware involved. The maxSize is a limit of my USB thumb drives (approx 64*10^9 bytes), and maxDir and maxFiles are limitations of my player. The rootDir is where we will build our disks from, our master list. The newLocation is where we will copy our files to build our disks.
For the meat of our method we define a few numeric variables to track the size, number of files, and number of directories in each disk, and we define a diskList to hold each listing of files which will make up each disk. Note that this is a nested list: a linked list of linked lists. We have created one linked list that stores other linked lists of files, and we will add to it as we assemble each disk. Next we create our first file list (representing a disk) and add it to the disk list. The last step before we start to assemble our disks is to gather a list of all the artists, which each have a separate directory under the rootDir; we call this list the headList, and we sort it so that the disks are split up alphabetically by artist/band.
Once this is all set up we simply iterate through each artist/band and collect a list of their files, their directories, and a total of their size. If they would overflow our current disk we start on a new disk; otherwise, we add them to our current file list and move on to the next. Once every disk's file list is assembled we remove everything from our newLocation (this assumes that you have old disks that will be superseded by the ones about to be copied), and we copy all the files to their associated disks. Note that this last for loop would be slightly faster if it iterated through diskList instead of using pop(), because pop() incurs the overhead of un-linking a list node, though compared to the time the copy itself takes, this is negligible...
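Stripped of all the file I/O, the grouping logic is a greedy "first overflow starts a new disk" partition. A simplified sketch of mine over plain sizes (only the size limit is checked here; the real code also checks file and directory counts):

```java
import java.util.LinkedList;
import java.util.List;

public class DiskPacker {
    // Walks items in order; whenever adding the next item would exceed
    // maxSize, a new "disk" is started. Items are never reordered, so an
    // alphabetically sorted input stays alphabetical across disks.
    public static List<List<Long>> pack(List<Long> sizes, long maxSize) {
        List<List<Long>> disks = new LinkedList<List<Long>>();
        List<Long> disk = new LinkedList<Long>();
        disks.add(disk);
        long current = 0;
        for (long size : sizes) {
            if (current + size > maxSize) {
                disk = new LinkedList<Long>();
                disks.add(disk);
                current = 0;
            }
            disk.add(size);
            current += size;
        }
        return disks;
    }

    public static void main(String[] args) {
        System.out.println(pack(java.util.Arrays.asList(40L, 30L, 50L, 10L), 60L));
    }
}
```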
There are lots of other things you can do to automate maintenance on large digital collections like this. Do you have any ideas?
Handy way to negate a two's complement number: (~n)+1
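This follows from how two's complement is defined: flip every bit, then add one. A quick check (the class name is mine); note the one overflow quirk, Integer.MIN_VALUE negates to itself:

```java
public class Negate {
    // Two's complement negation: invert every bit, then add one.
    public static int negate(int n) {
        return (~n) + 1;
    }

    public static void main(String[] args) {
        System.out.println(negate(5));  // -5
        System.out.println(negate(-7)); // 7
        System.out.println(negate(0));  // 0
    }
}
```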
Monday, May 27, 2013
Java file organization - part 1
Location:
Salt Lake City, UT, USA
In my car I keep all my music on a collection of (currently 8) 64GB USB thumb drives. It's great! I can fit years' worth of music into a case that would only hold 5-10 CDs or DVDs. I have a directory for each artist which contains a separate folder for each album. I can print out this list of artists/albums very easily, laminate it, and keep it in my car to easily find what I am looking for.
We start by defining a regular expression (regex) to match known good file types. A discussion of regex is beyond this post, but this basically says to match anything which ends in .mp3, .wma, .m4a, or .wav.
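A few sample file names (my own test values) show what the regex lets through:

```java
public class ExtCheck {
    public static String validExtRegex = "^.*?(\\.mp3|\\.wma|\\.m4a|\\.wav)$";

    // Lowercase first, as the listing code does, so "SONG.MP3" matches too.
    public static boolean isValid(String fileName) {
        return fileName.toLowerCase().matches(validExtRegex);
    }

    public static void main(String[] args) {
        System.out.println(isValid("bestsong.mp3")); // true
        System.out.println(isValid("COVER.JPG"));    // false
        System.out.println(isValid("playlist.m3u")); // false
    }
}
```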
The method takes a File object which should be the directory we want to retrieve the children of; the rest of the parameters are booleans controlling whether we should return directories, whether we should return files, whether to include only valid file types, and whether to skip empty directories.
The real meat of this method is the root.listFiles(isDirectory) and root.listFiles(isFile) calls, which use the FileFilter objects we create below the method as anonymous inner classes. These return only directories or only files from the listFiles() method of File. If you are unfamiliar with the syntax used to create these objects, search Google for 'anonymous inner class'. The only real complexity in this method is in handling the exclusion of empty directories and filtering out unmatched file types, but the logic involved should be self-evident if you have done much coding.
This method is great, but what we really need is the ability to list files/directories to any depth. There are two ways to go about this sort of task: recursive and iterative. Though many people are tempted to go for the elegance of a recursive solution, for arbitrarily large data sets the stack overhead makes it less than ideal, so I have instead written an iterative solution:
This method should be fairly straightforward. We are simply using a list named dirsToCheck to collect each directory we find, and we continue to remove and check each directory till the list is empty. This is known as a breadth first search (as opposed to a depth first search) because it examines each complete level in the hierarchy before moving to the next. This is a fairly simple method, most of the real work is being done in the first method we created.
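The breadth-first idea can be sketched in a minimal, self-contained form (my own stripped-down version; the post's getAllChildren adds filtering options). The demo method builds a throwaway temp directory purely to exercise it:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.LinkedList;

public class Walker {
    // Breadth-first listing: a queue of directories to visit; each level
    // is consumed completely before its subdirectories are explored.
    public static LinkedList<File> listAll(File root) {
        LinkedList<File> results = new LinkedList<File>();
        LinkedList<File> dirsToCheck = new LinkedList<File>();
        dirsToCheck.add(root);
        while (!dirsToCheck.isEmpty()) {
            File dir = dirsToCheck.pop();
            File[] children = dir.listFiles();
            if (children == null)
                continue; // not a directory, or not readable
            for (File child : children) {
                results.add(child);
                if (child.isDirectory())
                    dirsToCheck.add(child);
            }
        }
        return results;
    }

    // Builds a throwaway tree (one subdirectory containing one file)
    // and returns how many entries listAll finds in it.
    public static int demo() {
        try {
            File root = Files.createTempDirectory("walk-demo").toFile();
            File sub = new File(root, "a");
            sub.mkdir();
            new File(sub, "f.txt").createNewFile();
            return listAll(root).size();
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```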
In my next post I will show how I build my file lists to be written to thumb drives. This will include deleting and copying files!
This arrangement only really has two drawbacks: I have to split my music library into chunks of less than 59.6GB (since disk manufacturers define a GB as 10^9 bytes, while your computer defines it as 2^30), and my car stereo will only read files of type mp3, wma, m4a, and wav. Files like videos, album covers, and playlists (.m3u) are useless to transfer for my car. Splitting my collection efficiently and weeding out the unneeded files would take me ages, and my collection continues to expand. I would be fumbling through menus, drag & drop, and shortcut keys forever doing it all by hand.
What are computers but the most versatile tools created by man? Why spend my days doing something that a computer can do so much faster? To my rescue comes Java, a handy little language with a short development time for small widgets like I would need, but the power and speed to efficiently manage a task as massive as I demanded.
The other day I took about five hours and wrote up a set of utilities to handle the vast majority of my music file management tasks, including building and copying my disk lists. I like little utilities like this; they help me keep my skills limber.
I figured this might be a good topic to write a little how-to on some of the stuff I did. I will assume you know some basic java as well as the LinkedList class.
The heart of any file organization task is being able to find all the children of a given directory. To that end I wrote a helper method and two basic filters to do just that:
public static String validExtRegex = "^.*?(\\.mp3|\\.wma|\\.m4a|\\.wav)$";

public static LinkedList<File> getImmediateChildren(File root,
        boolean getDirs, boolean getFiles, boolean onlyValidFiles,
        boolean skipEmptyDirs) {
    LinkedList<File> list = new LinkedList<File>();
    if (!root.isDirectory())
        return list;
    if (getDirs) {
        File[] children = root.listFiles(isDirectory);
        for (int i = 0; i < children.length; i++) {
            if (skipEmptyDirs) {
                // Look inside this child directory to see whether it
                // holds anything worth keeping
                File[] grandchildren = children[i].listFiles();
                if (grandchildren == null || grandchildren.length < 1)
                    continue;
                boolean skip = true;
                // Keep the directory if it contains a subdirectory, or a
                // file (a valid one, when we are filtering by type)
                for (File f : grandchildren)
                    if (f.isDirectory()
                            || (f.isFile() && (!onlyValidFiles || f
                                    .getName().toLowerCase()
                                    .matches(validExtRegex)))) {
                        skip = false;
                        break;
                    }
                if (skip)
                    continue;
            }
            list.add(children[i]);
        }
    }
    if (getFiles) {
        File[] children = root.listFiles(isFile);
        for (int i = 0; i < children.length; i++) {
            if (onlyValidFiles
                    && !children[i].getName().toLowerCase()
                            .matches(validExtRegex))
                continue;
            list.add(children[i]);
        }
    }
    return list;
}
public static FileFilter isDirectory = new FileFilter() {
    @Override
    public boolean accept(File pathname) {
        return pathname.isDirectory();
    }
};

public static FileFilter isFile = new FileFilter() {
    @Override
    public boolean accept(File pathname) {
        return pathname.isFile();
    }
};
We start by defining a regular expression (regex) to match known good file types. A full discussion of regex is beyond the scope of this post, but this one simply matches anything which ends in .mp3, .wma, .m4a, or .wav.
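To see what the pattern accepts and rejects, here is a tiny standalone check (the file names are made up for illustration):

```java
public class RegexCheck {
    // Same pattern as validExtRegex in the listing above
    static final String VALID_EXT = "^.*?(\\.mp3|\\.wma|\\.m4a|\\.wav)$";

    public static void main(String[] args) {
        // A supported extension matches...
        System.out.println("track.mp3".matches(VALID_EXT));    // true
        // ...while playlists and other types do not
        System.out.println("playlist.m3u".matches(VALID_EXT)); // false
        // matches() is case-sensitive, which is why the method
        // lowercases file names before testing them
        System.out.println("TRACK.MP3".matches(VALID_EXT));    // false
    }
}
```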
The method takes a File object, which should be the directory we want to retrieve children for; the rest of the parameters are booleans controlling whether we return directories, whether we return files, whether we include only valid file types, and whether we skip empty directories.
The real meat of this method is the root.listFiles(isDirectory) and root.listFiles(isFile) calls, which use the FileFilter objects created below the method as anonymous inner classes. These return only directories or only files from File's listFiles() method. If you are unfamiliar with the syntax used to create these objects, search Google for "anonymous inner class". The only real complexity in this method is in excluding empty directories and filtering out unmatched file types, but the logic involved should be self-evident if you have done much coding.
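If the anonymous inner class syntax is new to you, here is the same pattern in isolation; the throwaway directory layout is created just for the demo:

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;

public class FilterDemo {
    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway layout: one subdirectory, one file
        File root = new File(System.getProperty("java.io.tmpdir"), "filter-demo");
        new File(root, "albums").mkdirs();
        new File(root, "track.mp3").createNewFile();

        // Anonymous inner class: define and instantiate a FileFilter
        // in a single expression, right where it is used
        File[] dirs = root.listFiles(new FileFilter() {
            @Override
            public boolean accept(File pathname) {
                return pathname.isDirectory();
            }
        });

        System.out.println(dirs.length); // prints 1: just "albums"
    }
}
```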
This method is great, but what we really need is the ability to list files and directories to any depth. There are two ways to go about this sort of task: recursive and iterative. Though many people are tempted by the elegance of a recursive solution, for arbitrarily deep directory trees the stack overhead makes it less than ideal, so I have instead written an iterative solution:
public static LinkedList<File> getAllChildren(File root, boolean getDirs,
        boolean getFiles, boolean onlyValidFiles, boolean skipEmptyDirs) {
    LinkedList<File> list = new LinkedList<File>();
    LinkedList<File> dirsToCheck = new LinkedList<File>();
    if (root.isDirectory())
        dirsToCheck.add(root);
    while (!dirsToCheck.isEmpty()) {
        // pop() removes from the head while new directories are added
        // to the tail, so dirsToCheck behaves as a queue
        File file = dirsToCheck.pop();
        LinkedList<File> childDirs = getImmediateChildren(file, true,
                false, onlyValidFiles, skipEmptyDirs);
        LinkedList<File> childFiles = getImmediateChildren(file, false,
                true, onlyValidFiles, skipEmptyDirs);
        if (getFiles)
            list.addAll(childFiles);
        if (getDirs)
            list.addAll(childDirs);
        dirsToCheck.addAll(childDirs);
    }
    return list;
}
This method should be fairly straightforward. We simply use a list named dirsToCheck to collect each directory we find, and we continue to remove and check each directory until the list is empty. This is known as a breadth-first search (as opposed to a depth-first search) because it examines each complete level of the hierarchy before moving to the next. It is a fairly simple method; most of the real work is done in the first method we created.
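For contrast, the recursive depth-first version mentioned earlier might look like the sketch below (filtering omitted to keep the shape clear). Every level of directory nesting costs a stack frame, which is the overhead the iterative version avoids:

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedList;

public class RecursiveWalk {
    // Depth-first: each nested directory adds a stack frame, which is
    // why very deep trees risk a StackOverflowError
    public static LinkedList<File> getAllFiles(File root) {
        LinkedList<File> list = new LinkedList<File>();
        File[] children = root.listFiles();
        if (children == null)       // not a directory, or unreadable
            return list;
        for (File child : children) {
            if (child.isDirectory())
                list.addAll(getAllFiles(child));  // recurse into subdirs
            else
                list.add(child);
        }
        return list;
    }

    public static void main(String[] args) throws IOException {
        // Throwaway layout: walk-demo/a/b/song.mp3
        File root = new File(System.getProperty("java.io.tmpdir"), "walk-demo");
        File deep = new File(root, "a/b");
        deep.mkdirs();
        new File(deep, "song.mp3").createNewFile();
        System.out.println(getAllFiles(root).size()); // prints 1
    }
}
```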
In my next post I will show how I build my file lists to be written to thumb drives. This will include deleting and copying files!
Friday, May 24, 2013
How I quit smoking
Location:
Salt Lake City, UT, USA
I loved smoking. I certainly was addicted, immensely so, but beyond that I was in love with the act of smoking. At any point in my day when I was bored I could have a cigarette, and ten minutes would be filled with addictive pleasure. It provided an oral fixation as well as something to do with my hands. I also love old fashioned things, trinkets; my Zippos and cigarette cases were things I proudly carried and used regularly, long ago having become part of my personal style. On the psychological front, smoking is a way to instantly differentiate myself from the average strait-laced Mormon, and growing up and living in Utah, there is value in such a display.
The key to me quitting was being able to do so without giving up the things I loved about smoking. I started using an e-cigarette. I hadn't planned on quitting smoking entirely. I had an idea in my head that I would use the e-cigarette most of the time, but still enjoy a real cigarette once a week or so. But once I had become accustomed to the e-cigarette I just lost my desire to smoke. There was simply no need when the e-cigarette took care of all the things I loved about smoking, and contained none of the drawbacks.
To those who are not aware of e-cigarettes, or those who have been caught up in all of the misinformation which surrounded them when they first became available, I will give you a quick rundown. An e-cigarette consists of two primary components: a battery, and something called an atomizer. Atomizer is a fancy word for a group of devices which break a liquid into a fine mist ("atomize" in the classical sense). Inhalers are a type of atomizer. They work in many different ways, but in an e-cigarette the atomizer contains a small heating coil which vaporizes the liquid. There are two styles of atomizers. The more popular ones are built into a cartridge which holds the e-juice; the other style uses an atomizer surrounded by a mesh, designed to drip e-juice onto directly.
The key to an e-cigarette, of course, is the e-juice, which is a mixture of nicotine with a base and sometimes flavoring. E-cigarettes work by vaporizing e-juice with the heat from the atomizer. The base used in e-juice is typically propylene glycol, a substance commonly used as a base for medicine and food (it's used in nebulizing inhalers). Some people are allergic to propylene glycol, so sometimes vegetable glycerin will be used instead. Nicotine is a poison, produced by the tobacco plant to deter insects. It is the active ingredient in cigarettes, the substance you become addicted to. In small doses, with a built up tolerance (addiction), it is a relatively safe substance. It causes an increase in heart rate similar to the effects of caffeine. Usually a flavoring agent is added to the mixture. My go-to flavor is Chocolate-Vanilla-Hazelnut, but there are literally thousands of flavors made by different companies.
The choice between different types and brands of e-cigarettes is such a personal thing, but I will tell my own experience and hope that it is enlightening.
I started out buying an e-cigarette package from a local smoke shop. The brand was Smoke 51, designed to work with cartridges. They advertised a cartridge as being equivalent to a pack of cigarettes, with 5 cartridges costing about $12. I used this for two to three months and was satisfied, but I did have two complaints: the cartridges would leak e-juice onto my lips, which tasted disgusting, and they were imprecise; the way they wick the juice to the atomizer meant that toward the end of their life the cartridges started to give off wispy, unsatisfying draws, even though the cartridge still contained a lot of juice. The batteries also only tended to last me 6-8 hours.
To deal with the short battery life I bought a much cheaper unit from the corner store as a backup (I don't recall the brand). This brand had a different design, using a separate atomizer and cartridge. The atomizer was a drip atomizer with a wire mesh, and the cartridge was designed to fit on it and wick to the mesh. After a few days of using the cartridges I got tired of the same issues with juice leaking into my mouth and started using this e-cigarette as a drip atomizer instead of using the cartridges. It worked very well, and I was much much more satisfied with this setup. I put away my Smoke 51 unit and never picked it back up.
I liked dripping onto my new e-cig, but the one I had gotten was as cheap as they get. It only held a charge for 4-6 hours, and the thing was crooked when you screwed the atty onto the battery. Armed with the experience I had gained from vaping for nearly six months, I went online to find solutions to my previous complaints about my e-cigarette experience. I found a company called Volcano E-Cigarettes, which seemed highly recommended, and had a very active forum where their customers had built quite a neat little community. The product which drew me was a well crafted version of the typical e-cig you find, but it came as a pair, along with a handy carrying case which also acted as a charger, holding enough juice to recharge both batteries a few times. I went with a cartridge-like option, a clear tube which held a large volume of e-juice, but I switched back to using a drip atomizer when I had issues with the tanks leaking.
After using this portable charging case, which they called the Magma, I was sold, and began trying their different flavors, eventually hitting on the Choconilla-haze, which I have used exclusively for over a year now. I also became intrigued with a more high end e-cig which Volcano was selling, more appropriately referred to as a cigar due to its oversized battery, called the Inferno. The Inferno came in two styles, a larger battery, and a smaller battery that can charge via USB while in use; both styles came in the starter kit. I quickly gave away the larger battery as I found that the USB pass-through battery was enough to last through a long day, and was convenient to keep plugged in while at my desk using it. After using the Inferno for a few months I gave away all my other e-cigs and got a second Inferno battery as a backup. I continue to use the Inferno to this day.
Volcano recently came out with a new e-cigarette named the LavaTube. It is much larger than the Inferno, and much much larger than a typical e-cigarette. It uses a removable battery and has variable voltage, along with improvements under the hood. The variable voltage means that you can adjust how dense or 'hot' the draw will be. I tend to run mine at 3.8 volts. The removable batteries mean that I can buy a number of them (I have 4) and charge two at a time in a separate charger. I can go backpacking for 4-7 days without needing to recharge this way. I use my LavaTube mostly at my desk, as it is a little large to carry in my shirt pocket, and I carry the Inferno out and about with me.
There are lots of different companies out there making cool e-cigarettes. So Volcano is not the only game in town, and I encourage you to go see what's available. I can say that with the few problems I have had (one unit that failed after a few months, and one shipment of the wrong flavor of juice), they were quick to correct the issues and very easy to deal with. I have placed maybe thirty or so orders with them in the past 2 years.
And last, if you made it this far, you are likely committed to quitting smoking. I want to explain a difference between the nicotine you get in a cigarette, and what you get with pure nicotine, like that found in e-juice. Cigarette manufacturers use a chemical process to change the nicotine in cigarettes so that it is absorbed into your bloodstream much faster than it would be naturally. Because of this, cigarette smokers are trained to the immediate response (the fix) that smoking gives. Pure nicotine can take 5-10 minutes to fully absorb into the bloodstream, and as such, transitioning from cigarettes to e-juice can be less satisfying than people may expect. Cigarettes are a game of rewarding a desire; e-cigarettes are more about establishing a constant level of nicotine in your system. By the time you crave a cigarette, it is too late to get a quick cure with an e-cigarette. Instead, you will need to get used to using the e-cigarette for maintenance, using it regularly to maintain the level which you are accustomed to.
In closing, I know a lot has been brought up about the health risks of e-cigarettes, so I thought I should address this, as I have done extensive research on the subject. The medical science on nicotine and propylene glycol is fairly extensive, and both substances are safe as they are intended to be used in e-cigarettes. The flavoring used is typically USDA grade and food-safe. However, there is one big warning here. KNOW WHERE YOU GET YOUR E-JUICE FROM. Buy from a no-name, fly-by-night operation and you have no idea what they put in there. Buy from a reputable source where you can confirm the ingredients and methods used in preparation; this is very important. This is another reason I went with Volcano, who have a large brick-and-mortar business located in the US, employing food-safe laboratories and methods. The same can be said for hardware, such as batteries and atomizers. Batteries can malfunction, and parts can be made with toxic substances. Again, ensure you are purchasing from a reputable manufacturer that cares about what goes into its products.
Since quitting smoking, and taking up the e-cigarette I feel great, I don't stink all the time, and I expect to be around a lot longer! If you have any questions for me leave them in the comments!
Friday, March 29, 2013
Stop worrying so much about what other people do...
Location:
Salt Lake City, UT, USA
It's been a bit since I gave an update, but I wanted to take a moment to rant.
Something that I find terribly disconcerting in US culture is the increasing imposition of arbitrary morality on the population. I say this thinking about gay marriage, but this phenomenon can be seen in many aspects of our culture.
To lay a foundation for this it is important to be able to objectively view the issue. By granting gay couples the right to marry you have not removed any rights from anyone. Straight couples still enjoy the same rights as they always have. So you are effectively increasing the rights of the entire culture without impacting the existing rights afforded to all.
By denying gay couples the right to marry you also preserve the rights of straight couples to marry but you deny those rights to gay couples. So comparing the two outcomes you have an outcome that increases rights, and an outcome that restricts them. It is important to remind everyone that our country was founded on this idea of personal liberty and freedom, granted by right to all, the same rights in question, the freedom for two people to choose to marry.
So which country are we? The country that embraces freedom and liberty for all, or the country that likes freedom and liberty, as long as it's not too unpopular?
I personally think it is our duty to protect the rights and freedom of all people, despite our own personal views of them. I disagree with lots of people, but I would defend with my life everyone's right to live their lives as they see fit, and I will continue to rage against those who attempt to impose their will on others. Don't like gay people? Then don't associate with them. It's really a simple tactic that has been working for people for thousands of years.
I use gay marriage as an example because it is current, but this phenomenon of imposing our beliefs on others around us pervades many aspects of our culture, and I think it harms us. If people stopped worrying about what other people are doing and spent that time working on themselves I think we would be much closer to a utopia.
Thursday, May 10, 2012
Reflections on Yauch
Life gets hectic and things like my blog suffer :(
But I wanted to take a moment to reflect on something.
When I was young, 8 years old to be exact, a neighborhood friend found a tape in the street. The year was 1987, and the tape he found was the Beastie Boys' Licensed to Ill. I still remember the excitement with which he told me about this music he had found, and we rushed up to his room where for the first time I heard Girls, Fight for Your Right, and No Sleep till Brooklyn. We played that thing till it fell apart, memorizing all the lyrics, and regularly annoying teachers with our renditions of this band none of our peers or teachers had heard of. The friend and I parted ways long ago, but my relationship with the music never ended.
Several years later, when I had grown a bit and started buying my own music, the BBoys were at the top of the list. I fell in love with Paul's Boutique (still my favorite album), and Check Your Head is still considered by me to be the most musically influential album of my lifetime.
It's been many years since I started my love affair with these three rockers from Brooklyn, but that love has only grown.
I wanted to take a moment to share how much I appreciate all that the BBoys have done, for both me and our world, and the sadness I feel over the loss of MCA. The world lost a true hero that day, and the loss affects us all. He will be missed and honored. My thoughts are with the band, and I sincerely thank you all from the bottom of my heart for being a constant ray of sunshine in my life!
Saturday, September 3, 2011
Ruby Horse Thief - Colorado River
Location:
Ruby Horse Thief Put-In - Colorado River
We had been planning a river run for labor day for a while now (thanks Jed!). At the last minute our plans for going up to Wyoming fell through, so with short notice we got in touch with some friends in Colorado and decided to meet roughly half way to do Ruby Horse Thief on the Colorado river. There were 5 of us from SLC, and 3 from just outside Denver.
We met up with our friends and camped just outside Fruita, and in the morning, after some shopping, headed out to the put-in opposite Loma. After unloading 5 of us left to shuttle cars to the take-out while the rest stayed and pumped up boats. We had a hard shell kayak, an inflatable kayak, a 16' raft, and a 13' raft, along with a load of gear.
Other than some slow leaking in the inflatable kayak, the first day on the water was great! I'd done several rafting trips, in whitewater and calm, and with oars and paddles. This was the first time I'd done the kayak though, and it was quite a different experience. You're much closer to the water; just hanging your arms over the sides you can easily touch it, whereas on a raft you're at least 18-24" higher sitting on anything inflated. They are also incredibly agile. You can turn on a dime, and run circles around a raft. Also, by sitting lower in the craft you have a little more sun protection, which after a few hours of being on a river can be a valuable thing! The kayaks we had were made for two people, and they would have easily handled that, but they also easily work for one person plus a many-day load of gear. I could have easily fit a large dry bag, a med-small bag, and still fit 5-10 gallons of water (I didn't do this, however, because my boat leaked, lol). The hard shell kayak had gear and supplies to last at least a week.
Other than the beautiful rocks and wildlife, the river here is nice and relaxing. Few and far between there are spots where you can intentionally get a little splashed, or bounce around just slightly, it's predominantly calm waters. This was exactly what we were looking for :)
Our campsite for the first night was MEE 2. When we got there, we found a brother and sister had poached our spot, but luckily it was a huge site, so other than having to unload our gear through their camp (where one of them rudely continued to lay in the path we had to traverse laden with gear), this was a minor inconvenience. We certainly informed them that we had registered the site, and planned on being up drinking and partying, and they didn't complain, so to each their own I guess :)
The campsite was nice. It consisted of a few trees, the largest of which we setup under, and a backdrop of cliffs with a large overhanging bowled-out cave. We did some hiking around in the morning, it was a beautiful area.
The second day was a short one for the river. We only had to go 4 miles to our next camping spot at Black Rock 10. This is the last site in the Black Rock area, and was recently formed when the water level opened up a new site (BR 10 used to be BR 9). It was a good spot other than the lack of shade, but luckily we had a canopy to help us make a few small patches of relief from the sun.
The rocks in this area are unlike anything I've run into before. According to Mike they are intruded magma from which the sandstone has eroded away. In high water the intricate rocks can become dangerous, creating sinks that can suck a boat right in. They also provide excellent diving platforms, and we watched several people jumping off them into the waters below!
The third day on the water was our last, and we made the final 8 miles a lot faster than we expected. It was sad to unload and prepare to leave the river, but we secretly were all dreaming of our hot showers and soft beds after the drive home!