Sunday, January 28, 2024

When AWS Ruins Date Night

Full disclaimer. This will probably be pretty dry, boring, and maybe overly technical. But...it provides some insight as to what I do for work, and for people who work in a similar field, it might be relatable to problems that you deal with. And besides, the primary purpose is to document events for myself. If other people enjoy reading about them, great.

For the last seven years, I have worked for a company called 365 Retail Markets. We provide hardware and software for micro-markets. I have spent most of my time with the company working in the vending division. My current title is Manager of Vending Technology. I head up a small team that is responsible for three different card reader devices, a website for managing these devices, and an API that allows the devices to communicate with our backend. I primarily work from home, from my little office in our basement.

It was Friday and the work day was winding down. I had plans to go see a movie with Jeanell and a few of my kids. (We were going to see Mean Girls Musical Movie. First it was a movie. Then it was a musical. Now it's a musical movie. And Tina Fey has laughed all the way to the bank every time.) It had just passed 5:00 and I stepped away from my computer. We weren't planning to leave for the movie until 6:15 or so, so I thought I might go relax for a while, maybe read a bit of my book.

Suddenly, I got an alert on my phone saying that our database server had gone down (our backend consists of two server instances in AWS, one that hosts our database and another that hosts our web applications). I headed back down to my computer and when I couldn't log in to the database server, I pinged Rob, a co-worker who manages our AWS infrastructure. He is in the Eastern time zone so it was just after 7:00 for him, but he said he was logging back into his laptop.

While we were getting logged into the AWS console to see if we could determine the problem and get the server back up, the server suddenly came back online. Rob looked in the console and there was a notification that our server instance had failed a status check and since it was set for auto-recovery, AWS had automatically created a new server instance. It looked like we were good.

I use software called New Relic to monitor the performance of our websites and I checked it to make sure that things were going back to normal. It quickly became apparent that while things were back up, they were most certainly not back to normal. The average web request response time was much higher than it had been and our Apdex score (roughly, a measure of what share of web requests receive a response in an acceptable amount of time) was about half of what it had been. I logged into the website and found it to be running extremely slowly. We weren't out of the woods yet.
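
For the technically curious, here is a minimal sketch of how an Apdex score is computed under the standard definition: requests answered within a threshold T count fully, requests answered within 4T count half, and anything slower counts for nothing. The function name, the 0.5-second threshold, and the sample response times are all made up for illustration; they are not our real settings or data, and monitoring tools may configure this differently.

```python
def apdex(response_times, threshold=0.5):
    """Return an Apdex score between 0 and 1.

    response_times: response times in seconds (hypothetical sample data).
    threshold: the "acceptable" response time T in seconds (assumed value).
    """
    if not response_times:
        raise ValueError("need at least one response time")

    # Satisfied: answered within T.
    satisfied = sum(1 for t in response_times if t <= threshold)
    # Tolerating: slower than T but within 4T; these count as half.
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    # Everything slower than 4T is "frustrated" and contributes nothing.
    return (satisfied + tolerating / 2) / len(response_times)


# Made-up example: three fast requests, two tolerable, one very slow.
sample = [0.2, 0.3, 0.4, 0.6, 1.5, 2.5]
print(round(apdex(sample), 2))
```

A score of 1.0 means every request was answered within the threshold, so a score that drops by half, as ours did, means a large share of requests had become slow.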

At 6:00, with problems still ongoing, I texted Jeanell and told her I probably wouldn't be able to go to the movie. I was still hopeful that maybe things would start working again in the next few minutes, but it wasn't looking likely. She came down to my office, looking gorgeous. I couldn't believe I wasn't going to be able to go.

I pinged my boss Chad and our database administrator Kris and made them aware of the situation. They are both in the Eastern time zone as well, but both willingly jumped on a call, along with Rob and me. We were trying to get things going, but nothing was working. Jeanell was still waiting for me, but time was up, I wasn't going to be able to go. Jeanell and Lila left to go to the movie.

We spent the next 2.5 hours looking at logs, monitors, and doing everything we could think of to get the performance back to normal. We rebooted both servers. We tried to launch yet another new instance of our database server. Nothing seemed to work. Finally, we let Rob and Kris step away and Chad and I reached out and arranged to get on a call with AWS Support.

With the AWS support engineer, we again looked at various logs and ran various queries against our database, trying to figure out where the problem was. We were on this call for another hour without finding anything. Finally, he said he would send us instructions for gathering some logs that we could upload for another team to review. We ended the call.

We waited for the email with instructions on the logs that they needed. By this time, it was 10:00 my time. We wanted to get the logs before we stepped away so someone could be reviewing them while we slept. Chad pinged Rob and Kris for help with getting the logs we needed and even though it was now midnight EST, Rob responded and said he could get the logs.

That process took about a half-hour, but we got a zip file of the logs and were able to upload it to AWS for their review. At this point, we stepped away. I let Chad know that I would be up at 6:00 the next morning and if there was something for us to do, we could start working on it.

Jeanell and Lila returned home and we went to bed. I happened to wake up at 3 AM and AWS had responded with some additional suggestions. They suggested initializing one of our hard drives and updating some drivers. I went back to bed and got up at 6:00. We needed Rob's help with the additional steps and understandably he wasn't up yet. We pinged him and by 7:30 we were back on a call to continue working on the problem.

But the suggestions didn't really make sense to Rob or any of us. We went back and forth for a bit, but finally decided to request another call with AWS. Finally, we got an engineer who had some insight into what was really going on, due to a similar case he had worked on a couple of days before.

He noted that our database server instance type was m4 and said that he had seen some newly created m4 instances having performance problems. We had in effect created two new m4 instances: one created automatically by AWS, and a second when we tried the process again manually. Both had resulted in the performance problems.

He suggested upgrading our instance to an m5 instance type and thought that would resolve the problem. We stopped our websites, took a backup of the current server, updated some AWS drivers to later versions that were needed for the m5 instance type, and then performed an in-place upgrade of our database server. After completing the upgrade, we started the database server back up, started the websites again, and held our breath.

I watched New Relic and breathed a sigh of relief when the performance metrics went back to what they had been before the initial crash. Chad logged into the website and found it to again be fast and responsive. It looked like this had solved the problem. It was now 9:30 AM Saturday morning.

Fortunately, this kind of thing doesn't happen too often, but it was still frustrating to miss my date with Jeanell and the kids due to something outside of my control.

Either way, it felt good to finally have the problem solved.



Sunday, January 21, 2024

The Stop-and-Go Sign

I moved to Grantsville, Utah in the summer of 1983. I had completed kindergarten in Kaysville, Utah, where I was born. That fall I started first grade at Grantsville Elementary School in the class of Mrs. Jan Baird. Since this was forty years ago, my memories of first grade are not entirely clear. I remember learning songs from Mary Poppins and participating in the First Grade Circus (where I was a Hippity-Hop and also took part in the Purple-People-Eaters number). I'm sure there was some reading, spelling, and math as well. But there is one memory from first grade that I still remember pretty clearly.

When we'd eat lunch in the cafeteria, the teacher on lunch duty would carry around a Stop-and-Go sign. It was literally two pieces of construction paper (one red, one green) cut into circles and glued back-to-back with a popsicle stick handle in between. The words "Stop" and "Go" were printed on the corresponding sides. The sign would be used to indicate if a particular table was excused to go out for post-lunch recess. I don't remember exactly how this worked. There wasn't a sign for each table, so maybe it was just that when a table was excused to go out, the teacher would go to that table and dramatically turn the sign from "Stop" to "Go." In any case, it was used in some fashion to excuse kids to go outside after they had finished eating lunch.

My best friend in the first grade was David Fawson (still one of my very good friends who I can always count on to show up to the Alumni Basketball Tournament). One day after school, on our way to wait for the bus, we stopped by the Lost and Found, which was located in the cafeteria. I believe one of us was missing something and stopped to see if it had been placed in the Lost and Found. We did not find the missing item, but sitting atop the assorted items, was the aforementioned Stop-and-Go sign.

At the time, I think I believed it had been misplaced or lost and that is why it was in the Lost and Found. I later realized that's probably just where they kept it. For reasons I no longer remember or understand, and in an act that for me, qualified as rebellion, and against Dave's protestations, I decided to take the sign.

It's comical to me now, but I can still recall Dave's disbelief that I would be so brazen as to just take the Stop-and-Go sign. I mean, how would we be dismissed for recess the next day?

The consequences that would come to the school as a result of my larceny notwithstanding, why did I even want the sign anyway? What was I going to do with it? Its appeal as a source of entertainment seems limited. If it had been awarded to me or given to me as a gift, pretty sure I would have thrown it straight in the trash. But since I was stealing it, I apparently wanted it desperately. I grabbed it from the Lost and Found and headed out to wait for the bus.

In what was maybe an indication of how I would fare as a criminal, I was not discreet about my theft. Instead, I proudly displayed my treasure to several others at the bus stop. When the bus arrived, I got on the bus and took the sign home. I have no memory of doing anything with the sign once I got there.

When I came back to school the next day, I left the sign at home. (I guess I had the good sense not to return the item to the scene of the crime, or something.) Not long after the first bell rang and the school day had begun, Mrs. Karen Johnson (the school secretary at the time) came over the intercom: "Students, the Stop-and-Go sign is missing. If anyone has seen it, please let us know."

Never one who liked to be in trouble, I was immediately in a panic, but for a few moments, thought I could get away with it. I would just act casual, bring it back the next day, take it to the office, and claim I had found it somewhere.

With a swiftness that astounds me to this day, this fleeting hope was shattered. It seemed that the crackling of the intercom from Mrs. Johnson's initial announcement had scarcely gone silent when she came over the intercom again, this time only to Mrs. Baird's classroom: "Mrs. Baird, I've received several reports that Richard (I was known as Richard in those days) Mouritsen was seen with the sign." (Several?! Was this school chock full of narcs?!)

It seemed that every head in the classroom turned toward me with looks of disgust. Mrs. Baird looked at me disappointedly and asked sternly if I had the sign with me. I replied that it was at home. "Well, you better go to the office and call about it."

I slowly walked to the office. I entered, and Mrs. Johnson glared at me with maybe even a bit more sternness than Mrs. Baird had. "Are you here to call about THAT?" she asked icily, without a hint of a smile. "Yes," was my meek reply.

When I think about it now, I still laugh at the gravity that everyone gave the crime. We're literally talking about two pieces of construction paper, a popsicle stick, and a bit of Elmer's Glue. You'd have thought I had vandalized the school and taken some precious historical artifact with the way people were acting about it. Did we really need to get it back right then? Couldn't we have survived the day if I brought it back tomorrow? Couldn't the teacher on lunch duty just say, "You're excused to go outside." Just for one day?

I dialed my phone number (4-6967, in those days, we didn't even have to dial "884", just "4", although I honestly don't remember if from the school you had to dial "9" first to get an outside line. This was 40 years ago. I was in the first grade!).

I think about my mom, and what the situation at home would have been like at the time. I don't recall if this was in the fall or the spring, but I guess either way, my mom would have had three young boys at home. Carl would have been a little over or under a year, Alan about three, and Scott five. I don't know if my mom had a car at home or if we only had one car at the time, which my dad would have taken to work (at Zion's Bank in Tooele, which was at the corner of what is now Utah Avenue and Main, though I don't know if it was called Utah Avenue at the time. There's certainly another story there. Maybe another time). I believe at some point we received an old blue car (I don't recall the make or model) from my Grandma Nalder, but I don't know if we had it yet. (All I remember about that car is that it didn't run real well and was certainly not winning any beauty contests.)

I don't remember exactly how the conversation with my mom went, but do remember that she was not about to drop everything and bring the sign to the school. I told Mrs. Johnson I'd have to bring the sign back the next day.

And as best I can recall, I did. And thus ended that chapter of my troubled criminal past.

Attempting to Blog Again

One of my resolutions for the new year was to "Write a blog post (or equivalent) each week." While I have kept some of my other resolutions reasonably well, I have made no real effort at writing a blog post (or equivalent, whatever that means). Today I am attempting to change that.

Part of the problem has been trying to figure out things to write about. My take on current events, politics, local issues? While I certainly have my opinions on many issues, my past efforts in that area haven't typically generated a lot of positive discussion, and in general, I find wading into controversial topics to be exhausting. There may still be times I decide to put an opinion out there on a controversial topic, but for the most part I am content to keep my opinions to myself, unless specifically asked.

So what else of value do I have to write about?

I work out consistently, but don't really consider myself an expert. Maybe from time to time I'll write about that topic. I'm a software developer by profession. I'm not what I would call a cutting-edge developer, but maybe at times I'll write about my job. I enjoy reading, and try to read a variety of almost exclusively non-fiction books. I'm a big fan of stand-up comedy and musical theatre, so maybe at times I could offer opinions on those topics. While I've dedicated less time to it recently, I enjoy researching family history so at times I might write about the experiences of my and/or Jeanell's ancestors.

But since I've never consistently kept a journal (other than the two years I served a mission for the LDS Church), I think I will try to mostly write stories from my life. I don't know that my life has been particularly interesting, so writing about it will probably mostly be for my own benefit. But I feel like I'd like to write some things down so maybe at some point, if someone takes an interest in who I am or what I was about, they might stumble on this blog, and find somewhat of an answer.