What's not new at lucketts.net? This is the news archive page for 2006.

31 Dec 2006– Last note of the year (I hope). The network is currently up and running 100%. All access points are up and all of our servers and internal links are happy. Local issues like interference and bad home routers appear to be resolved for now, but they could come back. We'll be on call for emergency support over the holiday.

We did have an issue Friday & Saturday that was very hard to track down. It turns out our commercial RF link to Ashburn had a problem with interference on the Ashburn side, causing a sort of random slowdown when starting a download. Big downloads were not affected, particularly those we use to test network speeds. But a webpage like CNN, with 100's of little downloads, was taking longer than normal. At least one customer brought it to our attention, but when we tested download speeds with big files, everything zipped along. At about 11 PM Saturday we isolated the problem (our connection to the internet). By 3AM we found specifically what the problem was (the RF link interference between us and our Internet service provider). Sunday about 2 PM we finally got the correct fix in place and the network appears to be zipping along again, for both big files and web pages with lots of small files.

Have a safe New Year's Eve.

24 Dec 2006– The SpamAssassin program has been updated to work better. It can now catch those pesky image-based SPAM messages selling drugs and stocks. The new program also checks the SPF record, which is an anti-forgery mechanism implemented by most of the modern, cutting-edge ISPs. SPF gives us the ability to determine if an email claiming to be from a particular domain is really from that domain. As most SPAM has a forged return address, the ability to detect forged addresses greatly helps in the SPAM war.
The only problem to date is with emails sent from some Real Estate Agents. They are using a tool that provides listings to a client and adds the agent's return address to the resulting email in the FROM field. Since the email is really from sometool@somerealtycompany and the FROM address is listed as joe-agent@yahoo.com, SPF says the email's FROM address is forged. If you find that you are getting mail incorrectly marked as SPAM with the cover email mentioning SPF records, please let us know.
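
For the curious, here is a rough sketch of the idea behind an SPF check. It is not how SpamAssassin actually does it: real SPF records are published in DNS, while this example uses a made-up lookup table. The domain names, address ranges, and the function name spf_result are all assumptions for illustration.

```python
import ipaddress

# Hypothetical table standing in for published SPF records.
# Real SPF data lives in DNS; these domains and ranges are invented.
AUTHORIZED_SENDERS = {
    "yahoo.example": ["192.0.2.0/24"],
    "somerealtycompany.example": ["198.51.100.0/24"],
}

def spf_result(claimed_domain: str, sending_ip: str) -> str:
    """Return 'pass', 'fail', or 'none' for the domain an email claims to be from."""
    networks = AUTHORIZED_SENDERS.get(claimed_domain)
    if networks is None:
        return "none"  # the domain lists no authorized senders, so nothing to check
    ip = ipaddress.ip_address(sending_ip)
    for net in networks:
        if ip in ipaddress.ip_network(net):
            return "pass"  # the sending server is on the domain's list
    return "fail"  # the server is not on the list, so the address looks forged

# The real estate case: mail sent by the realty tool's server,
# but claiming to be from the agent's yahoo address.
print(spf_result("yahoo.example", "198.51.100.7"))
```

The last line is the situation described above: the realty server is legitimate, but it is not on the list for the domain in the return address, so the check fails.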

We're trying to add an anti-virus program, but that may still take a while. We'll announce when it's running.

22 Dec 2006– We were installing a new version of SpamAssassin today to improve the SPAM detection percentages and for a while it was turned off. It looks like today is not a good day to turn off SPAM checks, even for a few minutes. We got blasted with SPAM immediately. As the upgrade was still in progress and couldn't be stopped, we turned on the old, pre-Nov version of SpamAssassin for about an hour. So today you got to experience no SPAM checking from 4:50PM to 6:30, the old SPAM (pre-Nov) check from 6:30 to 7:05, and the new SPAM check after about 7:05PM. You can tell the difference by looking at the hidden headers on your email. The old SpamAssassin will be listed as version 2.55, the Nov SpamAssassin will be listed as 3.1.0, and the new version is 3.1.7.

With the new version, we will be able to add tools to catch the sneaky stuff, particularly the SPAM that comes in the embedded pictures in your email. Normal SpamAssassin cannot read the pictures, so it was very hard to tell SPAM from non-SPAM. The new SA will be able to OCR the pictures and pick up key words. That should improve SPAM detection and reduce false alarms. Currently, an email gets a few points just for having embedded pictures. In the future, the existence of a picture will not increase the score, but its content will. That way, email with picture signature blocks will not be marked as SPAM.

The new SA will also put us on the path to restore the individual whitelists that many of the customers use. It's sort of complex to give each customer a whitelist when we use a standalone anti-SPAM server, but we're working on it.

Still haven't heard back from many customers on the increased speed. We have tested everywhere we can, but it's not the same as the experience of the users. Our testing tells us it is a lot faster than before, but your perceptions are what's important. Do you think it has been faster or slower the past two weeks?

17 Dec 2006– Oops. An "allow" got changed to a "deny" on the main firewall. Everything stopped, but now fixed. Human error on our side, should not have happened. Sorry for the 24 minute interruption. We've been trying to keep system changes to the weekends and very late night to minimize impact on people. But this weekend I thought the impact would only be a few seconds of reboot, not blocking all traffic for over 20 minutes.

8 Dec 2006– On Monday a customer reported problems with large file downloads ending up corrupted. After some tests, we found that we could repeat the problem and it appeared related to our web cache files. We cleared the cache of the corrupted files, and everything worked again. We could not identify a systemic reason for the corruption. We assumed it was a file system problem with a unix machine that shut down improperly, and that clearing the bad files cleared the problem. A second customer on Friday mentioned the same problem, so we're looking to see if the same problem came back, or if it is unrelated. So far, we cannot replicate the Friday problem, but that doesn't mean it isn't there. If you notice problems with large file downloads, please let us know. The most common instance of this problem is trying to open a downloaded zip file greater than 4 MB in size. The zip file can be opened as a folder with no problem, but when you try to remove a specific, repeatable file from inside the compressed folder, or try to unzip the entire folder, you get a zip file corrupted error. If you see this problem, please email us with the particulars (when downloaded and from what link) so that we can identify the source of the corruption. Thanks...

1 Dec 2006– If a new internet provider offers you a service plan beating our price, please give us a call before deciding to switch. We will try to match any offer. We will be very flexible so as not to lose a customer already established with us. It may not be possible for us to match all offers, but we'll give it our best.

If the phone company were to suddenly provide DSL 1 Mbps service at $15/month, we could not match that. We would rather keep all customers, but sometimes we just cannot compete with certain offers. We are upgrading our network to enable us to compete with FiOS (the Verizon Fibre Optic network), but we are not quite there yet. If FiOS moves into your neighborhood, we will match the price of their lowest speed service, but we cannot, currently, come close to their higher speed plans. It is our plan to compete at the 15 Mbps rate as soon as possible.

If you are satisfied with our service, but are offered a better price plan, please give us a call before changing.

27 Nov 2006– We left the higher bandwidth setting in place on Monday and many people turned on P2P software. Three users can bring the network to a crawl if they turn on unlimited P2P. Remember, while you may only be downloading 1 or 2 or 3 Mbps, the rest of the world is scanning you for files. Even if you limit the number of connections allowed, the scanning is not restricted and we get hundreds of attempts to read your PC for what files you are sharing... per second.

Our network, like a cable network, is a shared bandwidth system. That means it is like a "party line", but the individual conversations go by so fast you seldom notice the line is being used by others. Unless the others are using P2P software. P2P software ties up the party line by never letting it go - ever. Other customers have trouble getting their 1 short call in while the P2P software is placing 100's per second. When we increase bandwidth limits from, say, 1 Mbps to 3 Mbps, normal customers get a snappier response as the peak download speeds go up. Normal customers have lots of short surges of bandwidth use followed by seconds of dead time, then more surges. This allows other customers to squeeze in between your surges and get their downloading done in their own quick surges. P2P software never takes a breath, it just downloads one surge after another. To make it worse, the P2P software on your PC invites the rest of the world to constantly ask you to upload files you've already downloaded. Each of those requests is another short surge.
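
To put rough numbers on the party line idea, here is a small back-of-the-envelope sketch. The figures are hypothetical, chosen only for illustration and not measured on our network: a normal customer surging at 3 Mbps but active about 5% of the time, against a P2P customer running 2 Mbps around the clock.

```python
def average_load_kbps(peak_kbps: float, active_percent: float) -> float:
    """Average bandwidth one customer uses: peak rate times share of time active."""
    return peak_kbps * active_percent / 100

# Hypothetical numbers for illustration only.
normal_user = average_load_kbps(peak_kbps=3000, active_percent=5)   # short surges
p2p_user = average_load_kbps(peak_kbps=2000, active_percent=100)    # never pauses

print(normal_user)             # 150.0
print(p2p_user)                # 2000.0
print(p2p_user / normal_user)  # roughly 13 normal customers' worth of load
```

Under those assumptions, one unlimited P2P session ties up about as much of the shared line as a dozen or so normal customers, even though its peak speed is lower.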

If a customer activates P2P software during the day or early evening without placing limits on its speed, we will place limits. P2P is, by the terms of service of Lucketts.net, not allowed. If the P2P user places reasonable limits on the software, we may let it run. But an unlimited P2P session will usually get capped by our routers to anywhere from 0 to 500 Kbps. Some of the caps are automatic, some are by hand. All have to be removed by hand, and we sometimes forget. Overnight we will sometimes let it run faster if nobody else is being impacted. In any case, if your P2P is impacting another user, we will cap it or block it.

We really try to accommodate the desires of all of our users, but prioritization is required. 1 happy P2P user does not balance out 100 unhappy, slow users. This happened today at 6PM. We were making some adjustments to the network, and received a few trouble calls. While backing out the adjustments, assuming that we had broken something, we discovered a customer running 2 Mbps of P2P traffic that was almost shutting down an entire access point. After capping the P2P, and re-fixing the stuff we thought we had messed up, the network started humming along again. After midnight, if any of us are still up, we'll consider removing the cap.

23 Nov 2006– Wi-Fi security is starting to be a problem. Not Lucketts.net security, but the customer's home wireless routers. Some customers insist on activating their router's security modes and some have their wi-fi security activated by a "helper" program without their knowledge. All end up calling us when they disconnect themselves. Our support varies according to the circumstances.

Both AOL and McAfee have software that will manage the security of your router for you, often changing settings without your knowledge, even if you have your router password protected. We do not recommend using the wi-fi management portions of these programs. If you do insist on using them, we seriously limit the free support given over the phone to get your network repaired. We will help you remove the automatic features that force security features on you, but if you choose to retain the security settings, our support will be provided at an hourly rate.

Some customers have been intentionally implementing WEP, WPA, and other variations of wi-fi encryption on their home networks. Good for them if they have the knowledge to maintain an encrypted wireless network. It's not really hard, but it is specialized knowledge that is a little beyond remembering which box to unplug to reset your connection. In every case to date, the customers that have activated wi-fi security and called us to fix their network when it went down did not need the security and actually ignored our advice not to use it.

There are cases where wi-fi security is recommended. The most common example is in crowded environments such as apartment buildings where it is very easy to pick up the signal from your neighbor. It's a target-rich environment, and very easy for someone to link into an unsecured network. But most of our customers are in the countryside, far from the nearest neighbor. The risk there is very low. And even if someone tapped into your network from a laptop while sitting in their car parked in your driveway, all they could do is steal some of your bandwidth. Because you all run firewalls on all of your PCs, right? And if you transmit secure information from your PC, you use secure web sites and transmission protocols, right? In the rural environment, using WEP and WPA encryption is very likely to knock you offline because you messed it up, but there is almost no credible risk of someone hacking into your data because you failed to encrypt your network. In general, we do not recommend using wi-fi encryption. There are special cases, but those are special and they know who they are.

Let us put it another way. You, the consumer, CANNOT use off-the-shelf equipment & software to secure your in-house wi-fi network in such a way that I could not break into it. You CAN prevent me from accessing your data by running firewalls on your machines and using https sites for your sensitive data. Well, actually, there are a number of firewalls I could break as well, but that's a different security issue. 802.11n is marginally harder to break into than 802.11g, so eventually things will improve in wi-fi security. If you really need wi-fi security that cannot be broken by the neighbor's kid, we can provide special hardware and software for the job.

I don't want to tell someone not to use security if they really want it. Farthest thing from my mind, as wi-fi security is a real money maker for us. If you choose to enable wi-fi security on your router, our free phone support ends at establishing that we have a valid connection to the Lucketts.net radio mounted in your roof or attic, and that the ethernet signal from that radio reaches your router. After that, phone or inhome support is at our normal hourly rates, currently $60/hour for simple support.

At least one customer read a magazine and, based on that knowledge, respectfully disagreed with our recommendation not to implement security. Again, no problem. We are happy to help the customer get exactly what they want. We can design and support low, medium and high security networks, as well as just plain spooky networks. We will offer our opinion as to the best security/cost/reliability trades, but the customer is boss and gets to choose what makes them happy.

A brief summary of our qualifications: Tammy is currently a consultant to the National Reconnaissance Office (NRO), Directorate of Security, where she is responsible for evaluating and approving the security of computer and communications networks handling data ranging from unclassified to Top Secret to SAR. She was recently selected as Security Employee of the Year for her expertise and outstanding performance. While in government service, Steve led the design and installation team for a 15K seat SAR secure world-wide computer network for the NRO, and served as Director of Network Policy for the NRO CIO. Steve was the primary author of the IT standards document outlining the minimum security and performance standards required of all NRO IT networks. As a Senior Staff Systems Engineer at Lockheed Martin, he was retained by the Dept of Justice to fix the Virtual Case File software suite that the FBI had horrible trouble with a few years ago. Directly following the VCF efforts, as a Principal System Engineer at Lockheed Martin, Steve was responsible for a classified assessment of wi-fi security for the CIA, evaluating security shortfalls and exploitation opportunities for both Wi-Fi and Wi-MAX technologies.

18 Nov 2006– Steve snapped at a customer tonight, and that is bad. We should never be snappy, even if our time is being abused. Here's the situation: A customer calls on a weekend evening to inform us in no uncertain terms that our network is down. We asked if they had rebooted, but got unclear answers. We suggested specifically that they reboot both PC and their router, and received feedback indicating that they had rebooted a router that their PC was not connected to. We asked them again to reboot their PC and the router that the PC was connected to, and explained how the routers have to be up first to assign addresses to the PC. There was still confusion or frustration on the customer side, and it was suggested that they may be better served by reading their router's manual before calling their ISP for free advice on how to manage their internal network. The customer thought he was being talked down to, and he was at least partially right. That wasn't nice of us.

Picture this: you have satellite TV service. You have a TV connected to a VCR that is connected to the satellite receiver. The TV gets a horrible picture, so you call the satellite company for tech support, and they determine that the sat receiver is working and sending a good signal to the VCR. When they give you "free" advice on how to fix "your" VCR, you apply the advice to a VCR not even connected to the TV having the problem, and get upset when they suggest you try it on the VCR actually connected to the TV having the problem. After fussing with them a while about how this can't be the problem, they suggest that maybe you should read the VCR's manual. Yes, the customer was right, he was being talked down to. Steve is not nearly good enough with people to not let some frustration through when people are willing to burn free tech support time but not willing to learn how to operate "their" equipment.

We are seriously considering the following process to deal with trouble calls that are clearly concerning problems with the customer's internal network: We can determine quickly during a call if the internet connection to a house is up. The radio we install to provide the connection can be contacted from the network operations center to demonstrate that an internet connection exists. In the future, we will do that first. If we demonstrate a connection, then the problem is with the customer's internal network (99% of the time anyway). We will suggest, without charge, that they reboot the equipment they own and manage. If it is a slow day, we may even give them some tips on how to set up their internal network. If that doesn't work, we will offer to a) provide premium phone support or b) make a house call to fix their internal network for them. If either of these options turns up a problem with Lucketts.net that is causing the outage, then there will be no charge.
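
For those who like flowcharts, the proposed process boils down to something like the sketch below. It is only an illustration of the steps described above; the function name and the wording of the outcomes are our own invention, not a tool we actually run.

```python
def triage(radio_reachable: bool, reboot_fixed_it: bool) -> str:
    """Walk through the proposed trouble-call steps in order."""
    if not radio_reachable:
        # We cannot reach the radio at the house, so the problem is ours.
        return "Lucketts.net problem: we fix it, no charge"
    if reboot_fixed_it:
        # Connection to the house is up and a reboot of the customer's gear worked.
        return "resolved by reboot: no charge"
    # Connection is up and the free advice did not help.
    return "offer premium phone support or a house call"

print(triage(radio_reachable=True, reboot_fixed_it=False))
```

The sketch leaves out one rule from the proposal: if paid support turns up a Lucketts.net fault after all, the charge is dropped.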

Please let us know what you think about this idea. We'll work on being nicer regardless. This message will be posted in the forums to see if it generates any discussion.

12 Nov 2006– This weekend did not help our uptime statistics. It started with a Verizon cable break sometime between 3 and 5AM on Saturday. Of course this was the day I slept in, so I didn't know that the network was offline until 9AM. Our website and mail were still available to customers to check status, which we started updating after about 9:30. Sorry for the delay. Verizon said it was an auto accident that broke the line. Last Friday, per Verizon, there was also an accident that broke the line about 5AM. hmmm.... Saturday's break was fixed about 11AM. It only took 3 hours to return all the phone messages to let people know they were back up.

We have a data connection to another provider that does not use the Verizon/AT&T T1s. It is online now, but not fully integrated into our network yet. When Verizon went down, we were able to shift some of the web traffic onto the other circuit so that basic browsing was possible (http only). The plan is to have the alternate provider be able to pick up all of the traffic from our primary lines if they go down. We're at least a month from that yet, as it requires a lot of coordination between us and the two service providers. If we do something wrong, it is actually possible to break the internet, so we are taking our time to do it right (not our internet, but "the" internet). We couldn't break it for long, but we sure could get some big companies very unhappy with us. So we have to put away the "BGP for Dummies" book and get a consultant to make sure it goes right the first time.

Once the Verizon line came back up, it took us longer to correct our fixes and workarounds than it took to put them there in the first place. The rest of the day was spent catching little things like changed IP addresses and routes. By Saturday evening we had the network back up and running smoothly, or so we thought. Sunday afternoon one of the servers rebooted automatically, which it is supposed to do in certain circumstances, so that's OK. But when it rebooted, some of Saturday's settings were restored rather than the normal defaults, and half the network lost web browsing ability. Took 30 minutes for us to notice the traffic pattern change and find the problem.

Sunday morning started well, even with the rain. Fixed one customer's internal wiring, a 1 hour job that somehow took 3 hours. Also met with 2 new horse boarders, nothing to do with the internet other than the web-cams in the stalls. Then around 1 PM, AP23 went offline, and that took out AP16 & AP19. Turns out a wire was disconnected at the remote site. We'll work on tidying up the installation so that will not happen again. About this time, we took a trip to Landsdowne Hospital, a side story, and picked up some Popeyes on the off chance we may not have time to cook.

At about 5 or 6, the real problems started. The AP8 link started going bad with ping times climbing over 1000 ms. This link is a dual 48 Mbps link. It feeds all of Taylorstown and Lovettsville. It is our newest upgrade, and we are very happy with it. The problem was traced to a single user that was literally flooding the network with a problem. We had to shut the user down and reboot the access point to clear the problem in a timely manner. The link has been solid since. We'll reserve further detail until we have a chance to analyse the problem further and put preventative measures in place. No need to hand someone the "turn off Lucketts.net" key.

As soon as AP8, AP18, AP28, AP14, AP14-2, AP21, AP25, AP27, and AP35 came back online from the link problem, AP6 started developing its own ping time problem. Times started going up and then pings started getting dropped. AP6 feeds AP24 & AP31 so another large set of customers started losing service. AP6 has a backup route in place though, so we connected through that to see what was going on, and prepared to route all traffic through the back door to get everyone back online. Sort of like the Star Trek thing they always do to fix problems, dragging little squares around on the computer displays that don't have any words that the TV can actually read. Only we didn't get a chance to change polarity - but I digress. AP6 was actually working just fine, but its normally outstanding signal strength was gone. Looks like we have a wind-blown antenna that is pointing the wrong direction. We have some cool things we can do with the RF spectrum to narrow the bandwidth and boost the signal, so that stabilized the connection, at a slow 6 Mbps, until we can go out in the morning and figure out which antenna needs to be fixed.

As soon as AP6 was happy, our 2nd ISP dropped offline, around 6:45PM, and a big chunk of bandwidth went away with it. Our software detected the failure and automatically re-routed traffic back to our primary connection. All smiles here as something finally worked as it was supposed to, and we would have time to find out what was wrong and fix it without unhappy customers calling us. Then we discovered that AP4 dropped offline as well. AP4 and the 2nd internet link are co-located on our tower. The outside tower. The wet, slippery, wind-blown, dark, outside tower. And the flashlight batteries went dead. Thought about calling Daniel in to fix this one. After stringing up extension cords and shop lights, we found a cover that blew off of a box, exposing electronic gear to the rain. After a bit of draining and replacing a few parts, the link and AP4 came back up.

Remembering the now-cold Popeyes chicken, it looked like a good time to leave the office for a few minutes to eat. Wrong. Turns out water was still dripping through the components in the tower box, and AP01 promptly went down as soon as the office was empty. Usually our AP numbers do not carry a lot of significance, but AP01 is, well, AP01. A really big chunk of the network dropped offline. This time an alert user gave us a call to point out that the primary transmitter on our site was down, no biggie. Another trip out into the rain, but at least the shop lights were still strung up. Up and running in a few minutes. All in all a pretty normal Sunday.

Down time on Sunday for most of these problems was less than 10 minutes, although some customers got hit with more than one problem. The big downtime was Saturday morning when Verizon went down. We have the fix for that designed, and the long lead items are already in place. We should be able to withstand Verizon failures with no noticeable impact to our customers within a couple weeks.

24 Oct 2006– Everyone is gruntled today so we're all in a better mood. Lots of people are using the internet on these last few cold nights, so usage levels have shot through the roof, catching us off guard. We were not expecting to need the new bandwidth until December, giving us quite a while to play with it before putting it online. We'll speed that process up now, and try to start offloading network load by the end of this week.

When loads max out, the network latency grows significantly. That means slower web browsing than if we had less bandwidth. So during the peak hours between 5PM & 9PM we will be setting bandwidth caps at 1 Mbps to keep everyone running well. We'll open it up more later when the traffic dies down. It's possible that we could lower the cap even further in extraordinary circumstances to preserve the net. Did have to do that last night for 30 minutes. It's amazing how many calls we get from people that want to know why they aren't getting the 3 Mbps peak speeds they get during the day. We allow for peak speed greater than 1 Mbps because this is a shared bandwidth network, and once in a while it will get busy and your share will be less than 1 Mbps. To help make up for it, we usually allow customers to run faster than 1 Mbps if there is excess capacity. Problem is that some folks feel cheated when they don't see the 3 Mbps, let alone 1 Mbps.

22 Oct 2006– Had a disgruntled client today. He apparently had a very low opinion of us and our network, partially based on his experience that we have not returned his trouble reporting phone calls for months. I gather he believes nothing we tell him about locating the source of the problem he has been experiencing. All understandable, as a sign of an incompetent service would be a failure to return calls.

If anyone else is trying to contact Lucketts.net, they most definitely should use the phone number we post on our web site, 703.349.3661. Or they should use the phone number attached to all of our emails, 703.349.3661. They most definitely should not use the number we had last year, as that has been replaced by the new number, 703.349.3661. If you were to dial the old number by mistake, you will get a message telling you to dial 703.349.3661 if you want to contact Lucketts.net.

We take customer support pretty seriously, even though we sometimes do not live up to expectations. If you dial our office number, 703.349.3661 in case you didn't catch it, we will respond ASAP if we don't answer it live. It is possible on occasion that we will lose a post-it, or Vonage will crash with the message, or some exotic forwarding problem will send your call into limbo. But we most assuredly will not intentionally fail to return your call.

The network is more loaded right now than we like, with usage at 100% for at least an hour each evening. During those max usage times, peak speeds will drop below 1 Mbps for everyone. Right now (11:05PM) peak speed on the network is 750 Kbps, as it is a very busy Sunday night. We are working on two solutions for this, both near term. The first is additional bandwidth. The new frac DS3 line is installed and we actually got a connection to the internet from it this weekend! We still have to debug and tune it, and then we can start transferring small groups of customers to that circuit. That will give us an immediate 50% increase in bandwidth, with much more available as we grow. When we get the BGP (geek stuff) routers up and running, the network will be dual homed (more geek stuff) so that we can move parts of the network from one connection to another without making you change your IP addresses. The BGP solution will be sometime in Nov.

The second way to fix bandwidth problems is to have better software on the firewall. Currently, when the network is under heavy load, we adjust the speed caps "by hand" to keep from overloading the current connection to the internet. One of the problems with that is that everyone gets the same speed cap, and the other is that the by-hand process is slow and prone to human error. We are writing software that will automate the process so that speeds drop and rise to keep the network running smoothly under load and as fast as possible when the load decreases. Most of the parts are there, and we just have to put it all together. Probably sometime in Dec.
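
To give a feel for what that automation might look like, here is a minimal sketch. It is not the software we are writing; the thresholds, step size, cap limits, and the function name next_cap_kbps are all assumptions picked for illustration.

```python
def next_cap_kbps(current_cap: int, load_fraction: float,
                  min_cap: int = 500, max_cap: int = 3000,
                  step: int = 250) -> int:
    """Pick the next speed cap from how full the internet connection is.

    load_fraction is uplink usage from 0.0 (idle) to 1.0 (maxed out).
    All thresholds and limits here are hypothetical.
    """
    if load_fraction >= 0.95:
        new_cap = current_cap - step   # nearly full: slow everyone a little
    elif load_fraction <= 0.70:
        new_cap = current_cap + step   # plenty of room: open it back up
    else:
        new_cap = current_cap          # in between: leave it alone
    # Never go below the floor or above the ceiling.
    return max(min_cap, min(max_cap, new_cap))

print(next_cap_kbps(1000, 1.0))   # busy evening
print(next_cap_kbps(1000, 0.5))   # quiet afternoon
```

Run every minute or so, something like this would lower and raise the cap in small steps instead of waiting for one of us to notice the load and change it by hand. Giving different customers different caps would take more than this sketch shows.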

10 Oct 2006– AP2 was upgraded today to state-of-the-art. It can support very high speeds to the end users, signal strength is higher, reception is better, and it can now relay high speed links to other APs. AP2 now provides the backup link to AP4 (and vice versa) that we broke yesterday. The backup link runs at 500 to 600 KBps actual throughput. The primary links run at 3500KBps, but that's only because we have them turned down. We haven't opened them up wide yet, but they should be able to support 6000+KBps with no problem. We are working to provide 6000KBps to each of our access points. To put that in perspective, that's faster than FiOS.
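
Since most of the other numbers on this page are in Mbps, here is a quick conversion sketch. It assumes "KBps" above means kilobytes per second, with 8 bits to the byte and 1000 kilobits to the megabit; if the figures were really kilobits, the conversion does not apply.

```python
def kilobytes_to_megabits(kbytes_per_sec: float) -> float:
    """Convert kilobytes per second to megabits per second (8 bits per byte)."""
    return kbytes_per_sec * 8 / 1000

# Rates quoted in the entry above, assumed to be kilobytes per second.
for label, rate in [("backup link, low end", 500),
                    ("backup link, high end", 600),
                    ("primary link, turned down", 3500),
                    ("primary link, target", 6000)]:
    print(label, rate, "KBps =", kilobytes_to_megabits(rate), "Mbps")
```

Under that assumption, the 6000 KBps target works out to 48 Mbps per access point.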

Also, the new RF frac DS3 is further along. It is now connected to the provider on the other side so we can test basic functionality. Their router is not completely set up yet so it will be a while before we can start putting customers on that link.

9 Oct 2006– AP4 was upgraded to a state-of-the-art radio set today. It has much more capability than before. Specifically, it can deliver over 20 Mbps to the home user. The primary link is restored to AP4 as well, and it now can support over 24 Mbps. If only the rest of our systems worked fast enough to support that. Unfortunately we broke the backup link while fixing the primary. I'd rather have the primary anyway.

8 Oct 2006– I moved the 8 Oct rant to the forums page where there is more room and a place for others to respond.

6 Oct 2006– The Chapel Lane access point (AP4) lost its link back to us last night around 10:30PM. We were watching as the signal started dropping and eventually died after about 10 minutes. There is a backup connection to that access point, but it still requires a manual changeover. Took about 10 minutes to get it running. Last night was to be the install and test of the automatic re-routing system, but now we'll have to wait for drier weather before trying that.
We now have a DS3 class wireless link established to Ashburn. The radios are connected and running now. Routers on both sides still need to be installed and configured, but that should only take a few days. We'll be checking out the reliability of the signal for the first month, and testing various redundant connection protocols to see which works best for us. It will probably be a few weeks to a month before the new bandwidth filters out to the customer base.
When the new connection is operational, we will be able to increase bandwidth with very short turnaround times. That will allow us to start ramping up again in size. The new connection will also allow us to offer VoIP services, and dedicated bandwidth plans, and eventually the 15 Mbps class service to the home.

25 Sept 2006– We attempted last night to activate path redundancy on all of the access points in the Lucketts area. We installed a dynamic routing protocol and turned it on all at once. With our current infrastructure, that would have kept everyone running when/if their primary link went down. I monitored & tested it for 2 hours last night, ending about 4AM. All was good. I left instructions for the staff to let me sleep and not to call unless something was really messed up. They called at 8:20. *sigh*
At 8:30, there we were looking at a dynamic routing table with about 200 routes changing in real time and trying to identify the one problem route at a glance. Even worse was trying to find the source of the bad route, as each access point generates its own 200 entry table and shares it with all the other access points. We ended up just turning it off and restoring the static routes.
We'll try a more gradual approach next time by activating small sections one at a time to let problems pop up in smaller groups where they are easier to catch.
Then this afternoon one of those failures the new system would prevent occurred, costing about 2 hours of frantic debugging and re-routing by hand. The new system would have fixed it in 5 seconds. We will get it implemented over the next few nights.
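
The more gradual approach mentioned above could look something like the sketch below. This is only an illustration of the idea, not what we actually run: the function name staged_rollout and the enable/healthy/disable callbacks are hypothetical stand-ins for whatever the routers really need.

```python
from typing import Callable, Iterable, List, Tuple


def staged_rollout(
    aps: Iterable[str],
    batch_size: int,
    enable: Callable[[str], None],
    healthy: Callable[[str], bool],
    disable: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Turn on dynamic routing a few access points at a time.

    Returns (enabled, suspects). If a batch fails its health check,
    that batch is rolled back and returned as the suspect list, so a
    problem is confined to a handful of APs instead of the whole network.
    """
    ap_list = list(aps)
    enabled: List[str] = []
    for start in range(0, len(ap_list), batch_size):
        batch = ap_list[start:start + batch_size]
        for ap in batch:
            enable(ap)
        if not all(healthy(ap) for ap in batch):
            # Roll back only the batch that misbehaved and stop here.
            for ap in batch:
                disable(ap)
            return enabled, batch
        enabled.extend(batch)
    return enabled, []
```

With a batch size of 3 or 4, a bad route would have to be hiding in a handful of tables rather than somewhere among 200 changing entries.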

23 Sept 2006– Messed up a router today at noon and shut off 1/2 the network for 10 minutes or so. The offending device was shut down and we will wait to work on it until after midnight tonight. We spent the past week working on improving backbone connections to each of the access points. We have 4 APs (of 32) left on the old system, and will try to get to them this week or so. Once the upgrades are complete, we'll be rolling out a prototype 10 Mbps service to a few test locations.
The new bandwidth is closer to reality. We have purchased the hardware, and the details have been worked out with our provider. We will be starting with a fractional DS3 line via wireless to Landsdowne. This link will give us a 2nd path to the internet so that if another posthole digger gets careless on Lucketts Rd, we will not lose our connection. We will also be able to increase this link up to 45 Mbps as needed. Slow but steady advancement.

7 Sept 2006– We have ordered the equipment for the new high speed link. Too bad AT&T couldn't wait a few more weeks to work on their routers. If we had the new link in place, we would have a backup route to the internet. The new link will start at 3 Mbps and we can scale it up to T3 speeds. As a result, we are seriously looking at offering higher bandwidth options, and dedicated bandwidth options. I've agreed to pricing with the provider and the hardware is paid for. Shipping, installation and testing will take weeks, but by the end of September we should no longer be dependent on one path to the internet.

2 Sept 2006– We have worked out the technical details for the new bandwidth and are ordering the equipment on Tuesday. ETA for increase is now days (or weeks) but not months. Getting new bandwidth is much more difficult than I would have predicted.
The rain has returned, and so have some random troubles that we thought had gone away. When we look long enough at a place having reboots or lost connections, we usually find someone running P2P software. If we shut down the P2P connection, the trouble usually clears up. Not a 100% cure, but the P2P does appear to be related to many of the problems. A troubling fact is that many of these P2P users were up before the rain with no impact on the network. There may be a combined effect here, and possibly some piece of hardware that is having a problem aggravated by rain and sensitive to P2P traffic loads. Will be working on it all weekend.
The account monitor that restricts browsing of folks that are 2 months past due was a wild success even with a few bugs. Just the top 7 outstanding accounts paid over $2000 in overdue invoices on Friday. Sorry we had to do that, but the past due bills were getting out of hand. We've turned off the browser intercept for the rest of the holiday weekend, but will re-activate it during the work week while our office is staffed. That way we can respond in real time to restore service when someone calls us to clear up an overdue payment.

23 Aug 2006– We are about to sign a deal to provide additional bandwidth to the network. The new link will not use the existing T1 lines, hence the network will have more than one path to the internet. That is good. We'll start with an additional 3 Mbps, but be able to increase to 20 Mbps with the same hardware. With a minor hardware upgrade, we can increase to 40 Mbps. This means that we will be able to provide another tier of service, something around 15 Mbps peak speeds. And we will be able to offer dedicated bandwidth to business customers who don't want to share their connection.
If all goes well, we should have the increase available in September. Updates to follow.

13 Aug 2006– No naps this weekend. Saturday the bad client router brought down 2 access points, then Sat night the transformer at my property failed, causing the power outage, now AT&T has a massive system failure that knocks us offline. Plus one of our intrusion detection systems went offline Friday, and one of the health and status monitors broke right as AT&T came back up. All taken in stride. It's a good thing that the rest of the family is at Disney World this week. It's never good to see this, except for the end part where the system comes back up. Now for the 24 voicemail messages piled up. By the way, we use Vonage for 2 of our phone lines, so if the T1s are down, Vonage just takes a message. We have a land line for those times the T1s fail, but that is for the AT&T router diagnostics.

12 Aug 2006– It always does this on weekends and Friday nights. Power is out right now in Lucketts and we are on generator backup. Not sure how widespread the outage is, but remote access points could start going down if it is extended. Save any important work often.

This morning we had a short outage on the primary access point for Lucketts, and the Montressor West access point attached to it. Turns out a client radio failed in such a way that it also jammed the access point. We were able to fix the jamming in a few minutes and then went out to replace the faulty equipment.
While the power was out the equipment problem came back. Looks like the customer router is the source, as our radio was replaced already. So it was a little hectic here managing the power and working the jamming again.
When the power came back up, we switched everything back over to utility power. First turn the generator switch from generator to off, then after a minute to let compressors settle, switch to utility. Except I switched back to generator and walked away. 20 minutes later AP0 and AP9 went back offline because the already low batteries ran down.

4 Aug 2006– Brand new AP6 installed to fix a reboot problem. Trying a different mix of equipment to avoid the same problems. Turns out the high power radio cards draw too much current from the backplane, causing instability. When we replace the high power card with a lower power one, we improve reliability and actually get increased transmit power. A case where more was too much.

We have a technical problem I could use help with. It's a routing problem that has stopped me for 3 hours so far today, and I have few ideas how to continue. I'll detail the problem in the tech support forum where discussion can occur. Please throw in your 2 cents if you have an idea that can help.

1 Aug 2006– Access Point 3 went offline this morning around 10AM. We detected it immediately, tried restarting it via remote control but failed. It took a site visit to get everything running again. Power supply UPS problem, new equipment installed to prevent the same problem from occurring again. The failure mode was an odd one though; it should not have happened in the first place. The APs have 2 mechanisms to force a reboot: the first is if the CPU stops processing then the hardware is power cycled; the second is if a connection to the internet is lost, then the CPU does a soft reboot. Neither mechanism worked today, which means the CPU was locked up enough to not run the connection check, but still pass the CPU lock test. It looks as if a power hit got through and mostly crashed the CPU. As we keep telling customers, power drops and spikes are the biggest problems for everyone, and good UPSs are a requirement if you want the equipment to stay up. That even applies to us.
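
For the curious, here is a rough sketch of how the two reboot mechanisms interact and how today's failure slipped between them. It is a simplified model, not the AP firmware; the ApState fields and the watchdog_action function are names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ApState:
    cpu_heartbeat_ok: bool     # hardware watchdog still sees the CPU ticking
    link_check_running: bool   # the software task that tests the internet link is alive
    internet_reachable: bool   # what that task would report if it ran


def watchdog_action(state: ApState) -> str:
    """Decide which recovery action, if any, would fire."""
    # Mechanism 1: the CPU stopped processing, so the hardware is power cycled.
    if not state.cpu_heartbeat_ok:
        return "power_cycle"
    # Mechanism 2: the CPU notices the internet link is gone and soft reboots.
    # This only works if the checking task itself is still running.
    if state.link_check_running and not state.internet_reachable:
        return "soft_reboot"
    return "none"


# The odd case: CPU alive enough to pass the lock test, but too locked up
# to run the connection check. Neither mechanism fires.
stuck = ApState(cpu_heartbeat_ok=True, link_check_running=False, internet_reachable=False)
print(watchdog_action(stuck))
```

In this model the gap closes only if the hardware check also confirms that the link-check task is still running, rather than just that the CPU is ticking.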
This afternoon a line supporting an overhead cat5 cable snapped on our property. Nothing went down (except the cable), but it was really hot stringing a rope overhead as a temporary support for the cable until we can get a steel cable to make a real fix.
We'll start posting the down times of the various access points as public information. While most problems are individual in nature, AP problems affect everyone and are of interest to many. The format and frequency will vary as we figure out the best way to do this.

28 July 2006– We're getting 5 calls a day that all go the same way: "We've been offline for [12 hours, all day, last x days]. We haven't changed anything so why is your service down and what are you going to do to fix us right now?" Some people say it nicely, others have used short, easy to understand words. Please reboot your equipment before calling us as that will almost always fix the problem. Power does flicker and lightning does strike. Computers, routers and radios all can shut off when that happens. Please remember the simple operators manual for problems - reboot and if that doesn't work call us.
And please don't wait days to call us. There are occasional problems that are temporary, but they almost always clear up in a minute or 2. If you are down for more than a few minutes, please don't wait to give us a call. Some customers wait two days to call us then find out that the reboot we recommend fixes their router problem and brings them back up. We always try to fix the same day, but some days are busy, such as after a bad set of storms, and we do prioritize. We try to factor in the customer need, but there are some general guidelines. For example, fixing access points is always the first priority, and, well, fixing the user that waited days to call us is not.

21 July 2006– Wow...10 days since last post. Growth is a little slower now as we rebuild portions of the network that are strained. Most of the resources the past month have gone into fixing existing customers' problems. Top priority, even if some people may not see it. As much as we hate having customers call to report network problems, please don't be shy. Too many times people call to tell me they have been down for 2 days and what are we going to do to fix them now. 8 times out of 10 a reboot on their side fixes it, 1 time in 10 we can adjust something from the office to fix it, so there is no need to be offline for days in 90% of the problems. 10% of the time it's something more serious and requires a house call or visit to a remote access point to fix. Please reboot everything once if you cannot connect, and if that doesn't fix it give us a call.

We can monitor your connection for big problems, and sometimes we'll call you before you call us to work it. But often you will be slow or have occasional glitches that we cannot detect easily unless we focus on the individual. Don't assume we know about the problem if you're offline.
New info on some of the recent problems: we have been monitoring the network for a long time now trying to make a correlation between some unknown cause and the slowdowns that hit certain access points. Those APs are mostly in Taylorstown/Lovettsville but there is also one serving Evans Pond and Lucketts that shows exactly the same symptoms. To clear the problem we have found the only reliable method so far has been to adjust the access point to transmit on a new channel. That appears to leave the problem behind and everything works again. For awhile, then the problem catches up hours to days later. We can treat the symptoms, but finding the cause has eluded us.

This week we had a few lucky correlations and we may have found at least one of the causes. Some customers have a signal that is borderline, particularly when it rains. We usually adjust their radio connection speed downward so that the radio can stay locked to the access point even when it rains. Some customers are locked at that slower rate and some are set to a variable setting that adjusts the rate as needed to keep up with conditions. Those customers usually run at a high speed, but occasionally drop to lower speeds due to local conditions such as wet leaves, blowing trees, or interference. We've found that when a small subset of these customers upload or download above certain thresholds, the subnet they are on starts to have problems. This appears to happen only when the client is locked on a slow radio speed but tries to send high bandwidth. The primary problem caused by this is that certain brands of routers belonging to other customers start failing and either pass their internal addresses out onto our network or act as a gateway for every address they see. Both actions start confusing the rest of the network, bringing it to a crawl. Sometimes the slowed network then causes the original problem client to slow down or stop their download, and the network starts clearing again and gets back up to normal speed. Sometimes the original problem client just locks itself up and refuses to slow down. When this happens, the only way to clear the network is to move it to a new channel to get away from the stuck client. After an indeterminate period of time, this cycle repeats.

Here's the proposed cause: The original client is in the vicinity of RF interference, usually a cordless phone but sometimes their internal wireless router. When their connection is locked on a high radio speed, the packets seldom collide with the interference and there is no problem. But when their speed is reduced by environmental problems, the same packets take much longer to transmit and their chances of being hit by an interference pulse are 10 times greater. Many of the packets that make it out are garbled, and they provoke the rest of the problems. If we block the small set of originating customers when the problem is growing, the problem goes away. If we do nothing, the problem usually grows until one of the originators just locks up, then the access point is blocked from further transmissions on that channel. Blocking the original problem source at that point is useless, because they are now blasting out garbage regardless of their connection status. The only option then is to move to a different channel until the original problem is rebooted to stop the blasting.
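
A back-of-the-envelope version of that argument is sketched below. The numbers are assumptions picked for illustration, not measurements from our network: a 1500-byte packet, radio rates of 10 Mbps and 1 Mbps, and interference pulses arriving at random at an average of one per second. Radio framing overhead is ignored.

```python
import math


def airtime_seconds(packet_bytes: int, rate_bps: float) -> float:
    """Time the packet spends on the air at a given radio rate."""
    return packet_bytes * 8 / rate_bps


def hit_probability(airtime: float, pulses_per_second: float) -> float:
    """Chance that at least one random (Poisson) interference pulse
    lands while the packet is on the air."""
    return 1.0 - math.exp(-pulses_per_second * airtime)


PACKET_BYTES = 1500        # assumed packet size
PULSES_PER_SECOND = 1.0    # assumed interference rate

fast = airtime_seconds(PACKET_BYTES, 10e6)   # about 1.2 ms on the air
slow = airtime_seconds(PACKET_BYTES, 1e6)    # about 12 ms on the air

p_fast = hit_probability(fast, PULSES_PER_SECOND)
p_slow = hit_probability(slow, PULSES_PER_SECOND)

print(f"airtime ratio: {slow / fast:.1f}x")
print(f"hit probability ratio: {p_slow / p_fast:.1f}x")
```

Under these assumptions a tenfold drop in radio speed means ten times the airtime per packet, and very nearly ten times the chance that any given packet gets clobbered.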

We thought for awhile that any customer slowed to the lower speeds would be a problem, but when we tested we found that only some of the slowed customers exhibited the problem. We have a current list of about 6 customers that originate this problem. All have cordless phones and wireless routers co-existing in their houses. We had 12 on the list, but when we started shifting them to a different radio frequency or locking them on faster speeds, they stopped being a source of the problem. We've spent a few days watching this and so far the theory holds. We are starting a nice test tonight that puts 5 of the trouble sources onto an access point of their own so that they cannot affect their neighbors. Then we'll work on their signals to get them out of the low speed connection realm. We should be able to do this without any customers noticing after tonight, other than those that see us installing new radios at their homes. Friday morning's 2 minute outage at 12:15 AM was caused by adjusting the new access point to only work for the "carriers" of the problem. Everyone else will be on the normal AP. This, coupled with installation of new radios and some new access points to deliver stronger signals, should clear up the problem.

11 July 2006– Been very busy with storms. Trying to catch up, so the last few days of nice weather have been great. We had 5 access points go down since 4 July, but all have been fixed within a couple of hours. Good thing is that we can upgrade them as they are being fixed, so we can continue to move forward upgrading the network. Except for AP 21 off of Milltown Rd. Lightning killed brand new equipment twice in 4 days. We'll have to toughen it up, as it's getting expensive to keep replacing equipment.
Access point 9 in Montressor (Leesburg Crossing) was upgraded Monday so that the occasional slowdown should now be a thing of the past. Next in line will be Chapel View so that its speeds improve.

29 Jun 2006– The storm keeps causing problems. The Temple Hall Access Point has had 2 power supplies burn out since Saturday. The heavy rains on the full leaves are making the worst possible signal conditions for the customers with marginal signals. We're performing lots of equipment moves and upgrades to get individual performance back up to speed. The 2 new access points around Taylorstown are better, but still not perfect. They don't reboot because of interference any more, but the frequency we had to shift to doesn't penetrate wet leaves as well as the standard frequency, so the heaviest rains pose a dilemma for us -- have low signal because of rain or change back to the full strength frequency and reboot once an hour because of a neighbor's RF interference.
Some of the APs may have also been damaged by a static discharge produced by one of the Saturday storms. A number of the APs have been acting strangely since then. It could also be that some client equipment was damaged. We are replacing much of the data backbone with new gear to make sure it's not causing the network problems.

27 Jun 2006– Sorry for the break in posts. The start of summer has kept us extraordinarily busy. Many customers are ending up with much worse leaves than we allowed for, so many radios are being moved or upgraded to improve signal strength. If you experience repeated drop-outs or an outright service stop, please give us a call ASAP. There are no extended outages on the system itself, knock on wood, so most times a problem is with your signal. We have many ways to correct or upgrade weak signals. We are replacing or moving 1 or 2 radios a day to fix problems with existing installations.
The interference source on 900 MHz near Taylorstown has caused us to virtually abandon that frequency. All of our customers north of Taylorstown Rd with 900 MHz radios will have their gear replaced ASAP with new improved 2.4 and 5.8 GHz equipment. We have 2 new micro access points deployed to make up for the 900 MHz coverage that we are dropping. The new radios are faster, so everyone should be happy once we get everything working.
Normally, a new access point takes 2 or 3 weeks to check out and get running well enough to put customers on. The 2 new ones in Taylorstown were turned on with live customers already in place, so there has been more pain than normal getting them stable. May was tough for the customers on the new AP on Lovettsville Rd, but it looks like we have stabilized that one. The newest AP in Taylors Valley was rebooting every few hours until we thought we fixed it 12 days ago. It ran without rebooting once for over 8 days and everyone was happy. But it started rebooting again Saturday evening. Either it suffered storm damage, or a neighbor with a 5.6 GHz cordless phone was on vacation for those 8 days. We're calling neighbors now and replacing the equipment as soon as it is dry enough to climb steep roofs.
Looks like one more micro AP will be added near Taylorstown Meadows to cover the customers there. Once that is done and the APs are cross-linked, stability should greatly increase for the Lovettsville side of the network. Once that's done, we'll be working on the refitting of the Lucketts side APs. There are still 3 access points here that need to have the 5.8 GHz backbone link installed, and 1 AP that just needs a general upgrade that had been promised a few months ago but was OBE'd by the spring problems.

10 Jun 2006– The new router and network modifications on Thursday night worked out well. The bridging load was significantly reduced, making the network work noticeably better. Tonight we will do the same modification to the access points serving the Taylorstown and Lovettsville areas. With this modification, performance issues due to network architecture should be mitigated and everything should just work better.
The changes tonight will involve multiple router reboots and new routing tables. If we're careful, it should work the first time like it did Thursday night. Worst case will require driving out to each access point and resetting them by hand to get them back online.

9 Jun 2006– New primary router installed this AM in preparation for new bandwidth, and to replace existing routers that were suspect. With the replacement, one more possible cause for some of the seemingly random performance issues will be eliminated. Working minor installation glitches with the new router. If you are having troubles, please give us a call.

The office has been doubled in size here so Daniel now can sit inside with the rest of us. We'll probably move servers inside as well.

25 May 2006– Sitting here listening to the generators run while power is down. Power company says minor problem will be fixed soon. Power company personal contact says new construction messed up and crashed the power grid. Power should be back on around 6 to 6:30. Some of the very remote access points will not last that long, so there could be local outages until power comes back up in those locations. Discovered that 2 of our 3 generators do not provide electronics quality power. Looks like we will be buying a new regulated generator soon. We'll post the old generators in the "for sale" forum to see if anyone wants them.

New Taylors Valley access point is up and running. We started shifting customers over to it this AM. This will help balance the network nicely. If you are in Taylors Valley, Taylorstown Meadows, or Meadow Vista we will evaluate your location and possibly move you to the new AP.

15 May 2006– T1 circuits went down around 2 AM today. AT&T said they were running unannounced updates to routers until 6AM. I called them again just after 6AM when everything was still down and they have no clue what they did, but now all the phone lines are down as well. Vonage being down I understand, but the copper lines are down as well.

Vonage is notified of the phone problem, and I was able to escalate and get them to commit to a tech visit this AM. They gave me an estimate of fixing the lines by 11:30 AM or sooner. I'm not sure what they based that on, but it's the only time provided so far. I'll post more info as it becomes available. AT&T owes us a status update about now.

11 May 2006– Most of the recent posts here have been discussing problems, and occasionally new customers are concerned by the volume of problems reported. We realize that some people may be put off by these reports and choose not to go with Lucketts.net. So be it. We will continue to post issues and their resolution as often as we can. Any system issue impacting multiple users will be fair to discuss here. What we will do though is remember to write about more of the cool stuff. As to the number of problems we have, we currently have 230+ customers and get an average of 5 phone calls or emails per day telling us of a problem a customer is having with the network. 2 of those will be user or user equipment failure, 2 will be network issues fixed by our tech in real time from the network center, and one will require a house call. Sometimes the tough one will take a few days to solve.
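
To put those numbers in perspective, here is the arithmetic as a small sketch. It uses only the figures quoted above (230 customers, 5 reports a day, split 2/2/1); the variable and function names are just for illustration.

```python
CUSTOMERS = 230        # "230+ customers", so rates below are upper bounds
REPORTS_PER_DAY = 5

BREAKDOWN = {
    "user or user equipment": 2,
    "fixed remotely from the network center": 2,
    "house call": 1,
}


def daily_report_rate(customers: int, reports_per_day: int) -> float:
    """Fraction of the customer base reporting a problem on a given day."""
    return reports_per_day / customers


rate = daily_report_rate(CUSTOMERS, REPORTS_PER_DAY)
shares = {kind: count / REPORTS_PER_DAY for kind, count in BREAKDOWN.items()}

print(f"{rate:.1%} of customers report a problem on an average day")
print(f"about one report per customer every {1 / rate:.0f} days")
for kind, share in shares.items():
    print(f"{share:.0%} {kind}")
```

In other words, roughly 2% of customers contact us on a typical day, and only one report in five needs a truck roll.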

On Monday we signed the paperwork to get a DS3 line installed. AT&T says it should be installed in 45 days. We are starting with a fractional line providing 15 Mbps, but we can upgrade that on demand via software to increase our bandwidth up to 45 Mbps as we grow. As a result, we will be able to offer dedicated bandwidth solutions to customers who prefer not to use the current shared bandwidth. Bandwidth levels will range from 256Kbps to 15 Mbps and will require special equipment for the point to point link.

We have been working behind the scenes preparing the network to support 15 Mbps of shared bandwidth to the home as a new service tier in most areas. Initial deployments will probably be this fall. New equipment will be required to get the 15 Mbps, and there will probably be a hefty charge, but the pricing is TBD as we're waiting for the equipment to be mass produced.

Three new access points have been added in the past couple weeks to get better signal to some existing customers and pick up new areas we haven't been able to reach before. The (very) high speed backbone inter-connecting the access points is expanding, with only 5 of the 25 access points left to upgrade. As an access point is added to the (very) high speed backbone, interference and network troubles for those customers is nearly eliminated.

Our problem access points used to be AP5, AP7, AP9, AP10, AP16 and AP19. AP5 to AP16 have been upgraded to the newest hardware, and no longer turn themselves off when other network areas have problems. Their continuous uptime is only broken now by long power failures. Except for AP10 on Stumptown, as it is still sometimes impacted by heavy rainfall. AP10 is our unsolved challenge. We have 2 or 3 options to get it linked into the (very) high speed network either in May or June. AP19 is to be replaced with new hardware at a new location on the next available installation date, and that should bring it up to operational status. Currently, no customer on AP19 is being billed for service as it is not operational, so we would really like to get it to operational status.

Enough for tonight, we have to prepare for possible power outages as the thunderstorms are moving in. More on Friday.

5 May 2006– 2 Access points were upgraded to very-high speed today. Of course, the 1 hour job ended up taking 5 hours, with network impacts taking another hour or two to resolve. A few user routers went crazy, and made everything slow south of Lovettsville until they were reset. But now 2 access points are immune to this type of interruption in the future.

The upset customer router causes ARP queues to flush constantly on the gateway routers and many of our radios, slowing things way down. The router did this until we were able to identify and isolate it. Users were juggled between primary and backup access points until we could narrow the candidate sources. It worked out that the owner of the wayward router called us about the same time we had narrowed the field to 2 choices. They rebooted their router and radio and everything got better. Problems like this will be more isolated as the rest of the access points are changed over. Change-over appears to be a painful process, and we haven't been able to do more than one per week. Over half-way now.

23 Apr 2006– Friday and Saturday were busy upgrade days. We've installed new high speed antennas both in Lucketts and Lovettsville to improve signal reliability and increase capacity of the system. Access Points 2, 4 & 7 were converted to the routed backbone, which greatly improves stability of the overall system. A new block of IP addresses was integrated into our system to go with the other routing changes, so we can gracefully expand some more. As soon as rain permits, we will be upgrading AP21, AP13, AP9 and AP7 antenna systems to take advantage of the new high speed links. New APs will be placed in a few subdivisions to better distribute and balance the system load.

We have a list of about 12 customers with degraded signals, mostly due to leaf growth aggravated by the hard rains. We will be visiting them this next week to move or replace antennas as needed. If you are having intermittent connects, particularly associated with rainfall, please give us a call and we'll check to see if you should be on the list.

16 Apr 2006– Spring cleaning I guess. We've been replacing access points to get rid of nagging issues. As the old APs are taken offline, the network gets more stable. It's working pretty well right now. Only one old AP left. Once that's done, we'll start upgrading the routing backbone so that the remote sites get a more stable connection.

Lucketts.net now has 2 employees. Many of you have met Laura on the phones, and we now have Daniel. With more people, there will be less waiting for Steve to come and fix things. We may even get ahead and start fixing some things before they break. A few of our customers who are IT professionals have also offered their services for tech support. We'll soon be in a position to offer more of the "geek-guys" or "Computer Squad" type of service where we can come to your home and get paid to explain things like the proper use of the built in cup holder.

We've been trying to offer tech support to the homeowner for things other than the internet connection. Usually when associated with an install this help has been free, as we were there anyway. If time is available and there are no network problems demanding attention, it's been the Lucketts.net policy to provide free tech support. A few customers have abused this policy. They call to report a network problem that somehow cannot be identified or fixed over the phone. We dispatch a technician to the home who finds that the network is working as it should. The customer then asks the tech to help with a few other problems they've been having, usually stuff like "how do I change the text size onscreen?" or "how do I get a new printer driver installed" or "Can you help me move my furniture to see if it looks better over there?". Now that an extra 30 minutes at one home means that another home's problem is not resolved until the next day, we'll be a little stricter about the amount of free support offered, unless of course pizza is offered. Some customers have already used our premium support, where we schedule the technician's time and charge by the hour. That capability will be expanding soon with the addition of a few new techs that can do house calls. The price is $80/hour plus parts. Minimum time billed for a house call is 30 minutes ($40). Satisfaction guaranteed, so if we do not fix your problem, then there is no charge. We will usually be able to schedule a tech within 24 hours of your call, but more often it will be the same day. The tech's time is yours while there, so you can ask any question and expect patient, thorough answers as the tech will not be pressed to leave. We can fix hardware, install software, run wires, provide how-to tutorials or classes, etc. Just let us know what you're looking for so we can send the right person.

10 April 2006– Best laid plans for the day sort of put aside. Started the day with AP16 almost completely un-usable, but the radio reported all was well. The six customers with no access disagreed. After a few hours of debugging and an on-site visit to replace the electronics, we found a few inches of water in a sealed antenna. oops...

The problem with the relay to Lovettsville/Taylorstown was still there this morning, so we replaced it today as well. Should stop the hourly reboots that caused those 45 second drop-outs. Testing tonight to make sure.

Access Point 8 also had a problem pop up this weekend, but we didn't discover it until this AM when a few business users tried to work from home but couldn't. Bandwidth on the AP was reduced to 50% and there were many dropped packets. We isolated the problem to a single user's machine and it was turned off. Network appears to be at 100% right now. We'll visit the customer site Tuesday to find what appears to be a high-speed router's problem. Jim, this may fix your problem as well.

If anyone is still experiencing slowdowns, please let us know. We'll be replacing 2 more APs tomorrow, trying to work ahead of problems rather than reacting to them.

27 Mar 2006– On Friday, the network started experiencing unexplainable hits to performance, so bad that some client radios shut down. We played detective and doctor all weekend trying to fix those that were down and to find the source of the problem. Problem located Sunday night, and all users known to be directly impacted have been brought back online. If you are still down, you are probably not reading this, but if you were we would tell you to reboot your radio and that should get you online immediately. The problem was traced to a combination or 4 problems with an old router with an old software load that was mis-configured and had trojan software running on the PCs behind it. The combination was bad, but unintentional. If you can, please check your router every now and then to see if there is a firmware update available for it.

24 Mar 2006– There is a recurring problem on Friday evenings. Something causes all of the routers to seize up and it takes about 20 minutes to clear. A few of the older APs require rebooting to get back online. We're upgrading the APs as quickly as possible to mitigate problems like this. When we get the network completely segmented, the problem will be confined to smaller and smaller sections until we can pinpoint its source.

A new intrusion detection system was installed during all-nighters Wed & Thurs, just in time for the weekly Friday storm. It may take a while to analyse the logs, but there is a fair chance that the source of the storm will be found this time.

20 Mar 2006– Don't everyone go running bandwidth checks at once... The new T1 is finally up. Never trust someone else's technician when they say they've got everything covered. I should have walked through it step by step to make sure. The AT&T guys had everything ready for the change-over except there was no gateway address loaded in the new router. When the old lines were moved and the new line was added, they just sat there with no gateway available for the internal network (that's us) to connect through. Of course, I was using a Vonage phone connection at the time (oops) and had to run out and find my cell phone. After contact was re-established, we got the proper settings into the router and it looks like it works. We'll be monitoring for the next few days. Let us know if you notice any abnormal problems. OK, let us know if you have any normal problems as well.
Now it's time to order the next bandwidth upgrade.

19 Mar 2006– Another big power outage testing our backup systems. This time at 11PM instead of 11AM. Grade on this one was a "D"... maybe a "C-". Primary generator would not start, so we left rack site 1 on UPS and moved on to generator 2, the new one. Gen 2 was the one we had problems with during the outage earlier this week. Turned out it needed a power conditioner to run our equipment and that was installed on Thursday, so gen 2 was ready to go, or so we thought. It started fine, and we put rack site 2 & the office on AC before their UPS shut down. Then we turned attention back to rack site 1 and gen 1.

After trying again to start gen 1, we cross-connected gen 2 to rack site 1 with a few hundred feet of heavy duty extension cord, but noticed that the heavy duty UPS would not work off of gen 2. Grabbed a spare power conditioner and put it in-line, but still were not able to get the UPS to cycle off of battery, which was starting to run down.

Starting to sweat now as I realized that the same type of heavy UPS was at rack site 2, and after running out to check it found that gen 2 was not letting the heavy UPS there cycle off of battery. So now we have the heavy UPSs running the racks at two sites about to run out of battery. Went back to gen1 and played with it in the cold and finally got it running. Dropped the gen2 crossconnect and powered up rack 1 with AC just in time to see rack 2 go offline.

Ran back to rack 2 and changed out the UPS for one that we had verified would run off of gen 2 and the power conditioner, and got rack 2 back up and running. It was only down for 2 or 3 minutes, but since it has the T1 router, everyone was offline for those few minutes... sorry. Came back to the house to rest for a few minutes and the power came back on. So it was back out to the generators to shut them down and change the UPS back over to utility AC power. When I got back to rack 1, I noticed that 1 of the servers had managed to panic during the excitement and shut down. I don't know exactly how long it was down as I had shut off some of the monitor servers to preserve battery power to run the crucial DNS, mail and web servers. It was a hard crash so the file system was corrupted, and it took 3 or 4 reboots before it could repair itself.

Right now it looks like the core systems here are all up and running. Power appears to still be out in a few of the remote locations, as AP16 is off-line as well as AP19. This affects 4 residential and 2 business customers. Good thing it's 0-dark-thirty on a Sat night. I hope the business users take Sunday off. AP16 & 19 should come up on their own when power is restored. If it is still down in the morning, I'll check with the homeowners at a reasonable hour.

18 Mar 2006– Friday we tracked down at least 5 laptops trying to access the network. That normally would not be a problem but for a few users with misconfigured routers that tried to supply DHCP addresses to the laptops. Caused a real mess on and off all day. Those devices have been locked out now. If you are using a wireless connection inside your house, be sure to select your in-home network instead of the LuckettsWISPxx connection. You will not get through to the internet via direct connect to a LuckettsWISP access point unless we set you up. You may get just far enough to interact with someone else's messed up connection and cause a routing storm on the network that affects many others. We are adjusting the network layout and security schema to prevent this, but it takes a lot of time for each piece. The areas already updated were not impacted on Friday.

There was also a problem with persistent connections to the internet being broken. VPNs, video links, remote connections, etc. were impacted whenever traffic exceeded 75% of capacity. Tracked the problem to the firewall controlling P2P download bandwidth. While trying to balance the available bandwidth fairly between users, the firewall was occasionally treating the VPNs and video connections like they were P2P traffic and giving them a very low priority, causing occasional interruptions. The specific source of the problem was found while I was off-site trying to VPN into the lucketts.net servers and I kept getting interrupted. See, it's good if I leave the area occasionally...

2 changes in the short term to fix this: a new T1 was just lit up on Friday, and its bandwidth should be available online early next week. We've also shut off the protocol prioritization rules on the firewall that have been working to balance loads. Instead we will re-activate hard limits on users that use download patterns that take an unfair % of the available bandwidth. Before you get too upset, the bandwidth caps will start at 1 Mbps, which is the speed you actually signed up for. Heavy downloaders/uploaders will not be able to take advantage of any available bandwidth above 1 Mbps, but everyone else will. If congestion reaches the point where normal users cannot get at least to 1 Mbps, then the heavy upload/download users will be restricted further to ensure that bandwidth is available to the bulk of users.
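
For the technically curious, the cap policy described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the actual firewall configuration: the function name, the `heavy` and `congested` flags, and the 0.5 Mbps reduced cap are all assumptions made up for the example.

```python
from typing import Optional

BASE_CAP_MBPS = 1.0  # the subscribed speed mentioned in the post


def download_cap(heavy: bool, congested: bool,
                 reduced_cap_mbps: float = 0.5) -> Optional[float]:
    """Return the download cap in Mbps for one user, or None for no cap.

    heavy     -- user's download pattern takes an unfair share (assumed flag)
    congested -- normal users cannot reach 1 Mbps right now (assumed flag)
    reduced_cap_mbps -- hypothetical tighter cap applied under congestion
    """
    if not heavy:
        # Normal users may use any spare bandwidth above 1 Mbps.
        return None
    if congested:
        # Heavy users are restricted further when the network is congested.
        return reduced_cap_mbps
    # Heavy users are otherwise held to the subscribed speed.
    return BASE_CAP_MBPS
```

With these assumptions, a normal user is never capped, a heavy user gets 1 Mbps, and a heavy user on a congested network gets the lower figure.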

The long term solution will use similar bandwidth control rules, but will be automated and dynamic. The only hurdle is the time required to write the code. I still have a position open for a 1/2 time to full time installer/programmer/webdesigner/network tech/customer support person that would like to get in on a new business and grow with us.

13 Mar 2006– Power was off today for about 90 minutes, so it was a good test for things to come. We found that one of our generators had a voltage output that was not within the range acceptable to our equipment, so we had to scramble a bit this morning. We ordered 3 new power conditioners to take care of this, so we should be ready for the next outage. Other than momentary hits, everything stayed up for at least 90 minutes. One site was shutting down just as power was restored, so that's the next site to get a UPS upgrade.

10 Mar 2006– In an earlier post I stated that Loudoun Wireless was one of the WISPs that blocked P2P traffic. John, the owner of Loudoun Wireless, informed me I was mistaken. His statement: "We do not forbid P2P. In fact, we encourage it. We do limit it for the same reasons you have stated in your web site." I have edited my posts to correct the record.
For those that care, Lucketts.net currently prioritizes P2P traffic lower than normal traffic, which usually manages the network nicely so that normal web traffic is not overwhelmed by the P2P, yet P2P is still usable and fast. In some cases where a P2P user downloads more than 10 GB in one day, we do put download speed caps on their account. The speed caps are currently 1 Mbps, versus the normal peak download speeds of 3 to 4 Mbps. On a few occasions, a P2P user has been capped lower when their traffic has adversely affected the network.
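
To put those numbers in perspective, here is a small Python sketch of the 10 GB per day rule and the arithmetic behind it. It is an illustration only: the function names and the per-user byte counts are hypothetical, and decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) are assumed.

```python
from typing import Dict, List

DAILY_LIMIT_BYTES = 10 * 10**9  # 10 GB per day, decimal gigabytes assumed


def users_to_cap(daily_bytes: Dict[str, int]) -> List[str]:
    """Return the users whose downloads today exceed the 10 GB threshold."""
    return sorted(user for user, used in daily_bytes.items()
                  if used > DAILY_LIMIT_BYTES)


def hours_to_transfer(gigabytes: float, mbps: float) -> float:
    """Hours needed to move `gigabytes` of data at a sustained `mbps`."""
    bits = gigabytes * 10**9 * 8
    seconds = bits / (mbps * 10**6)
    return seconds / 3600
```

Under these assumptions, `hours_to_transfer(10, 1.0)` comes to a little over 22 hours, so a user who passes 10 GB in a day has been downloading well above 1 Mbps for much of that day.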

8 Mar 2006– I can expand on the open position a little more: We're looking for someone that can start part time with the intent to transition to full time over the next few months. Skills needed or to be developed on the job include:

network mgmt / routing / BGP /
unix (FreeBSD), Win XP, Win 98, Mac OS
firewalls / Intrusion detection / monitoring
mail and various other servers
some web design & site maint
physical installations / rooftop work
some construction (we're building the new office/network center from scratch.)
On-site customer support

We can start at 20 hours per week and ramp up to full time pretty quickly as the business grows. I expect this to be an entry level position for an IT graduate or an accomplished geek.

3 Mar 2006– AP6 was replaced today. The rebooting was getting worse and worse so we could not wait until the weekend. Sorry for knocking everyone offline Friday afternoon. There were three or four 10 minute outages, but the new AP is happy. Another new router from AT&T showed up Thursday so that we can get another T1 installed. They say 17 March, but it may be possible to get it online sooner.

There is an employment opportunity for someone that has basic PC/networking skills and can work on roofs. We will provide training on the job. Position would start as part time and expand to full time. Good chance to learn a skill and grow with the new business. Give us a call if interested.

26 Feb 2006– After AP8 died on Thursday, I thought we'd get a chance to rest and re-group on Friday. But, it looks like AP7 (Spinks Ferry) was stressed by the events of the day and it died in its sleep Thursday night. Friday morning was spent on the rooftop in 30 mph winds replacing the unit. For this I gave up my beltway job? AP7 was replaced with the same gear. I would have rather put one of the new APs there, but the network wasn't ready yet. I had planned on at least one more week of life for AP7. The next AP to be replaced is either AP5, AP 7 or AP9. Let's hope none of these die and we get to replace one pro-actively rather than re-actively. - Steve 4:46PM

23 Feb 2006– During a maintenance stop on the access points to fix nagging problems, some equipment died while being maintained. All is in the process of being replaced and back-ups have been deployed. You should return to full speed later tonight. We apologize for the inconvenience.

18 Feb 2006– Another busy few days. There is a persistent problem in Taylorstown, slowing some customers down to speeds around 1 Mbps, and causing breaks in connections fairly often. While the 1 Mbps is not nearly as fast as the 3 to 4 Mbps peak speeds we usually provide, it still matches the max speed the other wireless providers provide. We have a complete suite of replacement equipment and will be working on the AP most of Thursday. Radios, cables, power supplies and antenna will be replaced. This should fix the sluggishness and connection interruptions. The nagging problems with customer routers continue, but their impact will be less and less as the network architecture is migrated to the new design.

12 Feb 2006– P2P software is the topic again today. Our goal is to balance/tune the network to accommodate the P2P users without impacting everyone else. But, it's not working. After 2 hours of fine balancing tonight (a very high volume day) trouble calls were still coming in about abnormally slow performance. This is after serious limits on P2P were already in place. So here's the plan for tonight: standard P2P ports are blocked. The users doing P2P on non-standard ports will have bandwidth caps placed on their IP addresses. The users with abnormally high connection traffic will have connection caps placed on their link. Later tonight I will evaluate the network load and ease the restrictions if warranted, probably after 11PM.

There were 5 users online with P2P downloads running wide open, and many more running limited P2P. Those 5 were significantly impacting the download speeds of all of the other 175 customers. 5 or 6 others were using the P2P responsibly, and I'm sorry they are caught up in the limits tonight. I'll discuss the technical details of P2P limitations in the lucketts.net forum under the announcements section for those that are interested. P2P is causing at least 1/2 of my problems, and I'd like at least one quiet night.

On a side note: AP 19 was replaced with an upgraded access point. AP 16 was replaced with an upgraded AP. AP 24 is installed and operational now. AP 10 has a new high speed link installed, but not tuned. AP5 was worked on this week, and will be replaced with a new unit currently on the test bench. AP 9 will also be replaced with a new unit currently being tested. A new AP will be installed in Glynn Tarra to take care of that neighborhood as they have outgrown the original access point. TaylorsValley will also get a new AP. All this in the next 2 weeks - weather permitting.

5 Feb 2006– Everyone should be up and running, no service known to be down now. There are a few known problems we're working on that are causing occasional glitches:

AP10 (Stumptown) still has problems when it rains. Hardware for a new link is in place and will be tested on the first calm, dry day. Involves raising a 50 ft mast. Until then, performance will drop on this AP when it rains.

AP8 (Taylorstown, Lovettsville) has a power issue causing a reset of the AP once or twice a day. The AP comes back up automatically after 30 seconds. We're replacing a different part every 48 hours to isolate the problem.

AP19 (Furnace Mountain/Rt15) This AP is still not considered operational. Users connected to this AP during test phase are not being charged. The current problem is that we have extended the network beyond what is stable. The entire radio system between the network center and AP19 may need to be upgraded before it stabilizes. AP19 was replaced 10 days ago, and the backbone receiver at AP19 was replaced 2 Feb. AP19 has responded well to these changes, but is still not up to speed. In Feb, AP16 (the next link up from AP19) will be upgraded, and then the link from AP16 back to the network center will be upgraded in March.

2 Feb 2006– Quick note... the new T1 is working like a champ now. It works so well that many of the P2P users have turned their P2P back on during the day. Over the past 6 months the network has been upgraded substantially, but it is being strained during peak times by the P2P. Some wireless networks just forbid P2P, while at least one local wireless provider encourages it. I'm trying to be flexible, but when 10 residential users complain of broken connections that clear up when I shut down a P2P user, I'm tempted to follow the example of the other providers. I've ordered another T1 already, so we will keep ahead of bandwidth problems, but we already have more bandwidth than any single access point can handle. One P2P user with 100's of concurrent data streams will still tie up an access point so badly that the other users have seriously degraded service. If the P2P users cannot control themselves, I will put serious caps on P2P use.
[This post edited 10 March 2006 at the request of Loudoun Wireless. I erred when I included them in a sweeping characterization]

23 Jan 2006– Wind storm appears to have blown the antenna at the Spinks Ferry/Evans Pond intersection out of alignment. We'll be out fixing it as soon as the frost gets off the roof. The certificate used to establish secure connections to the mail servers is now one year old and needs to be replaced. We should have set the age limit to 2 or 5 years, but instead we used the default of one year. As a result, a new certificate has to be created, installed on the servers, and distributed to all of the users. The new certificate will be emailed to everyone, and it will also be posted on the web site. Instructions for installation will be included. ETA will be this weekend. Until then, if you are using secure POP3S and/or SMTPS everything is still secure. You may get a warning from your mail client telling you the certificate is out of date. If that happens, select the option to use the certificate anyway for now. If you use Outlook you will have to do that each time you open the program.
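
For anyone wondering why the validity period matters, here is a minimal Python sketch of the date arithmetic. The issue date used in the usage note is a made-up example rather than the real date on our certificate, and the function names are ours, not part of any mail or SSL tool.

```python
from datetime import date, timedelta


def expiry_date(issued: date, valid_days: int) -> date:
    """Date on which a certificate issued on `issued` stops being valid."""
    return issued + timedelta(days=valid_days)


def is_expired(issued: date, valid_days: int, today: date) -> bool:
    """True once `today` is past the certificate's expiry date."""
    return today > expiry_date(issued, valid_days)
```

With a hypothetical issue date of 23 Jan 2005, a 365-day certificate expires on 23 Jan 2006, while a 730-day (two-year) certificate would have run until 23 Jan 2007 and saved everyone the warnings this week.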

23 Jan 2006– Looks like the difficult to pin down problems Sunday have been found. All afternoon and evening we received reports of slow traffic even when bandwidth was available. After checking and rechecking our network we found a few small problems, but those couldn't explain the observed problems. We finally did an external test on the T1s around 11PM and found that 1 was down and another had poor quality. AT&T worked with us through the night to troubleshoot, but around 6AM the 2nd T1 failed completely. It was pretty grim here as the morning traffic was just starting up and we only had 1 working T1.

AT&T and Verizon worked the rest of the day locating the problems. Turns out that one of the long copper runs was impacted by moisture (a Verizon line). We switched that circuit to a backup cable on our property and it came right back up. The second T1 had problems in the Verizon fiber down the road and it took the rest of the afternoon to get it repaired. Everything should be up to speed now. Let us know if you are still having any problems.

18 Jan 2006– Sorry for the unplanned outage for 1/2 of Taylorstown at 11PM Thursday night. I was testing the new backup Access Point to make sure it could pick up the load if one of the main APs failed. The backup failed in mid test, letting out an RF scream that knocked out one of the main APs and most nearby clients. All freqs were jammed for about 5 minutes. I was able to finally shut down the bad AP and get the main back online. That wasn't supposed to happen, but it was exciting here for a few minutes. I'll be replacing the backup Access Point Friday morning.

17 Jan 2006– The new T1 is working today! This will help greatly with recent congestion issues. The order is already in the works for the next T1. We'll try not to let it get ahead of us this time. The bad router that was seriously impacting the network on Sunday/Monday has been isolated. We'll be working with the homeowner to prevent recurrence of the problem. The transmitter for the backup backbone link to Taylorstown is installed. Still some work to do there before it is operational, but just having it in place as a ready backup is nice. The Temple Hall (#22) access point is up and being tested now. We'll be adding customers to it soon. Laura gets paid this week, so she is happy. Our new installer, Kyle, will be starting to pick up installations this week, freeing Steve to focus more on network issues.

13 Jan 2006– This patch is from Microsoft and fixes their most recent security problems with trojans/viruses in images. Highly recommended.

3 Jan 2006– There is a known security threat floating around the internet right now, known generically as the WMF vulnerability. There have been many newspaper articles about this, and a lot of hype, and a lot of misinformation. Here's what you should know: Windows and most Microsoft code have a bug that allows bad guys to hide viruses in images. If you view the image in ANY Microsoft application, it will attempt to infect your PC. The threat includes email programs and web browsers. Microsoft is aware of the problem and has released some preliminary patches in Win XP to help mitigate the problem, but their patches only partially solve the problem. Other MS programs are not fixed at all by their patch. MS plans on releasing a real fix on 10 Jan, but between now and then web browsing and email are much riskier than normal. Anti-virus companies (such as Norton and McAfee) are releasing updates to their programs almost daily, but the variations on the threat are growing as fast as the companies can update their software. You will be partially protected by them, but still at some risk.

To mitigate the risk yourself, there are two additional steps you can take. First, you can unregister the system DLL (shimgvw.dll). Doing this will help, but unfortunately other programs on your PC can re-register the DLL without your knowledge. If you are happy playing with the guts of your operating system, you should unregister "shimgvw.dll". If you need help, we will not walk you through it on the phone. Playing with the inner workings of your operating system is much too risky to do over the phone and requires a billable service call if you want us to do it for you. The second action you can take on your own is to save this patch to your desktop and execute it. Follow the instructions and you "should" be protected from all variants of the problem. This patch is not a Microsoft patch, but it has been vetted by the SANS Institute and they recommend it. SANS is "the" credible source on PC security matters. This web page is the SANS page describing the problem.

Old 2005 news is archived here: Old News for 2004