Thursday, May 20, 2021

Never Start a Goroutine You Can't Finish

The Go programming language has a pair of features that work well together for assembling a sequence of steps into a pipeline: goroutines and channels. In order to use these successfully, the goroutine needs to listen for communication it expects and watch for communication that might happen. We expect a channel to feed the goroutine items and be closed when no more items are forthcoming. Meanwhile, we need to watch a Context in case the goroutine should exit before the channel is closed. This article will focus on handling some of the edge cases that will keep your goroutines from finishing.
There are some fundamental things you should understand before using channels and goroutines. Start by completing the concurrency section of the Tour of Go. With an understanding of the fundamentals, we can explore the details of goroutines communicating over a channel. Each goroutine function is responsible for detecting when it's time to finish:
  • It is important to check in with your context regularly to see if it is "Done."
  • A closed channel will give a waiting receiver the zero value.
  • Ranging on a channel loops until that channel is closed.
Let's take a close look at two of the more common approaches I've seen. There's a lot to learn by looking at the trade-offs between these two approaches.

Infinite for loop

func ForInfinity(ctx context.Context, inputChan chan string) func() error {
	return func() error {
		for {
			select {
			case input := <-inputChan:
				if len(input) == 0 {
					return ctx.Err()
				}
				fmt.Println("logic to handle input ", input)
			case <-ctx.Done():
				return ctx.Err()
			}
		}
	}
}
  • When the inputChan channel is closed, you have to look for the zero value on the channel. The logic of that select case will need to detect the zero value to finish the goroutine – perhaps with a return nil or return ctx.Err().
  • When the context done case is selected, it feels natural to return ctx.Err(). By doing so, the goroutine is reporting the underlying condition that caused it to finish. Note that ctx.Err() is non-nil once Done is closed; it returns nil only while the context is still live, for example when the closed-channel case returns it under context.Background().
  • If more than one select case is ready then one will be chosen at random. Given the undefined nature of having both of these case statements ready, you might consider having the zero-value-detecting logic return ctx.Err(). This will ensure your goroutine returns as accurately as possible, even if the channel case was selected.


Range on a channel

func ForRangeChannel(ctx context.Context, inputChan chan string) func() error {
  return func() error {
    for input := range inputChan { 
      select {
      case <-ctx.Done():
        return ctx.Err()
      default:
        fmt.Println("logic to handle input ", input)
      }
    }
    return nil
  }
}
  • While the goroutine is waiting to receive on inputChan, it will not exit unless the channel is closed. Now our pipeline func is dependent on the channel close. If the Context is “Done,” we won't know it until an item is received from the range inputChan. Upstream pipeline functions should close their stream when finishing.
  • Range won't give us the zero-value-infinite-loop, as in the earlier example. The Range will drop out to our final return nil when the channel is closed.
  • The context Done case has the same impact here as it did in the earlier example. The difference here is that the Done context will not be discovered until the channel receive occurs — making it even more important that the channels are closed.

Be mindful of the flow inside your goroutine to ensure it finishes appropriately. That and lots of tests will ensure your goroutines finish under both normal and exceptional scenarios. Here are a couple of tests to get you started. These are written to exercise the same scenarios for each of the above goroutines.

func TestForInfinity(t *testing.T) {
	t.Run("context is canceled", func(t *testing.T) {
		inputChan := make(chan string)

		ctx, cancel := context.WithCancel(context.Background())
		cancel()
		f := ForInfinity(ctx, inputChan)
		err := f()

		assert.EqualError(t, err, "context canceled")
	})
	t.Run("closed channel returns without processing", func(t *testing.T) {
		inputChan := make(chan string)
		close(inputChan)

		ctx := context.Background()
		f := ForInfinity(ctx, inputChan)
		err := f()

		assert.NoError(t, err, "closed channel returns nil from ctx.Err()")
	})
}
func TestForRangeChannel(t *testing.T) {
	t.Run("context is canceled", func(t *testing.T) {
		inputChan := make(chan string)

		ctx, cancel := context.WithCancel(context.Background())
		cancel()
		f := ForRangeChannel(ctx, inputChan)
		go func() {
			//this test will hang without this goroutine
			<-time.After(time.Second)
			inputChan <- "some value"
		}()
		err := f()

		assert.EqualError(t, err, "context canceled")
	})
	t.Run("closed channel returns without processing", func(t *testing.T) {
		inputChan := make(chan string)
		close(inputChan)

		f := ForRangeChannel(context.Background(), inputChan)
		err := f()

		assert.NoError(t, err, "note there is no need to cancel the context, 'range' ends for us")
	})
}


Summary

By understanding how channels interact with range and select you can ensure your goroutine exits when it should. We have used both examples above successfully. Each different design has trade-offs. No matter the logic flow in your goroutines, always ask yourself: “how will this exit?” Then test it.

I think it's time for another Go Proverb: Never start a goroutine you can't finish!


Tuesday, January 19, 2021

Golang Method Receivers

There are different ways for a method to be assigned to a struct. One of the first things Golang developers learn is that a method can have a value receiver or a pointer receiver.

Notice below that foo's bar method has a value receiver and accepts zero arguments. Standard stuff. Where it gets interesting is how I call foo.bar in the example main function.

package main

import (
 "fmt"
)

type foo struct {
 msg string
}

func (f foo) bar() string {
 return f.msg
}

func main() {
 receiver := foo{msg: "the value"}
 result := foo.bar(receiver)

 fmt.Println(result)
}
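
The call foo.bar(receiver) is what the Go specification calls a method expression: it yields an ordinary function whose first parameter is the receiver. As a small sketch of where that leads, the example below stores method expressions in variables. The setMsg method is a hypothetical addition, included only to show the pointer-receiver form.

```go
package main

import "fmt"

type foo struct {
	msg string
}

func (f foo) bar() string { return f.msg }

// setMsg is a hypothetical pointer-receiver method added for contrast.
func (f *foo) setMsg(msg string) { f.msg = msg }

func main() {
	// A method expression yields a function whose first parameter is the receiver.
	var read func(foo) string = foo.bar
	var write func(*foo, string) = (*foo).setMsg

	receiver := foo{msg: "the value"}
	write(&receiver, "a new value")
	fmt.Println(read(receiver))
}
```

The explicit variable types are there only to make the resulting function signatures visible.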

Friday, June 5, 2020

Golang Shutdown Flow



While doing some enhancements to the Golang microservices at work I came across quite a few calls to logrus.Fatal late in the execution of the service. Some of these particular services are long-running processes that consume from Kafka and write to GCP Spanner. The problem with logrus.Fatal when called late in these services' lifecycle is that Fatal internally calls os.Exit(1). So let’s examine why this is bad for the system in which the service is running.

On the surface I wonder if a non-recoverable error encountered late in the process is “Fatal”. The interpretation of what “Fatal” means is subjective. At the beginning of the process when reading the configuration file, making external system connections -- this is a Fatal problem because the service can't even get started. Some precondition failure -- yeah, that’s Fatal. But if a service has been happily consuming messages and writing transformed data to a database only to encounter a non-recoverable error -- is that “Fatal”?

But that’s not what I wanted to show you. What I wanted to show you is why logrus.Fatal(...) is the wrong way to shut down a service that has encountered a fatal error.

First some basics: https://play.golang.org/p/b8CAlmiVZPH

package main

import (
	"errors"
	"fmt"
)

func main() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println("chance to recover:", err)
			panic(err)
		} else {
			fmt.Println("nothing to recover")
		}
	}()

	fmt.Println("Hello, playground")

	panic(errors.New("It's a perfect time to panic! -- Woody"))
}


This is basic Golang: the app panics, the defer will execute, recover() will consume/catch the error, and then we re-panic (just as a naive solution). The panic is reported, and the application returns a non-zero status code.

But what about os.Exit(1)? We know that logrus.Fatal calls os.Exit(1). We need to be aware of the impact upon our defer statements when we use os.Exit(int). As noted in the godoc, os.Exit terminates the program immediately and deferred functions are not run. It's just going to shut down: https://play.golang.org/p/fB8GnFxEATs

package main

import (
	"fmt"
	"os"
)

func main() {
	// This deferred function never runs because os.Exit skips defers.
	defer func() {
		if err := recover(); err != nil {
			fmt.Println("chance to recover:", err)
		} else {
			fmt.Println("nothing to recover")
		}
	}()

	fmt.Println("Hello, playground")

	os.Exit(0)
}

Here the defer does not run. If you were going to gracefully release the connections to Kafka, Spanner, or any other external resource -- that did not happen. For illustration's sake I also have this example returning the success status code of zero.

There is another way to exit a Golang application. Well, sort of: runtime.Goexit(). As noted in the godoc, all deferred calls are executed. However, it only exits the goroutine that calls it. So if it is called in your last goroutine, your service will crash -- in the same fashion as when all goroutines are blocked causing deadlock: https://play.golang.org/p/z0k56ZMoF8N

package main

import (
	"log"
	"runtime"
)

func main() {
	defer func() {
		if err := recover(); err != nil {
			log.Println("chance to recover:", err)
		} else {
			log.Println("nothing to recover")
		}
	}()

	log.Println("Hello, playground")

	runtime.Goexit()
}
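
For contrast, here is a small sketch of runtime.Goexit() called from a goroutine other than main. The deferred calls in that goroutine run, and main carries on and returns normally. The names worker and runWorker are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// worker is a hypothetical goroutine body that stops itself with Goexit.
func worker(wg *sync.WaitGroup, events *[]string) {
	defer wg.Done() // runs last, after the cleanup below
	defer func() { *events = append(*events, "deferred cleanup ran") }()
	runtime.Goexit()
	*events = append(*events, "unreachable") // never executed
}

// runWorker starts one worker and waits for it to exit.
func runWorker() []string {
	var wg sync.WaitGroup
	var events []string
	wg.Add(1)
	go worker(&wg, &events)
	wg.Wait() // Done follows the cleanup defer, so events is safe to read
	return events
}

func main() {
	fmt.Println(runWorker())
	fmt.Println("main keeps running and returns normally")
}
```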

Where does that leave us? Well, it's important to understand the impact of your libraries upon the flow of execution. The logrus.Fatal(...) method is definitely useful. Use it with the full understanding of what it is doing. Use it during service initialization before any defer functions have been registered. Use it when you know you want defer statements to be skipped.

Bonus:


It is important when fixing these sorts of problems with service shutdown to recognize the significance of your service's exit code. Your service's exit code is its last communication with the software architecture -- its dying breath used to wheeze out one little death rattle. Are you running your services in Docker? Kubernetes?

The service exit code is going to communicate to the container whether it exited successfully or crashed with an error. In those situations where you were calling logrus.Fatal, you likely do not want to just log the error and return. That would have your service return zero as its exit code, communicating success to the software architecture system. Make sure you take into account the pod and the configured restartPolicy. If your service is shutting down because of a non-recoverable error you likely want to wheeze out a death rattle of non-zero.
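
One way to get both the deferred cleanup and a non-zero exit code is to keep the real work in a function that returns an error, and to call os.Exit only from main after that function has returned. The sketch below assumes hypothetical names (run, exitCode) and uses a print statement as a placeholder for releasing Kafka or Spanner connections.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// run is a hypothetical stand-in for the service's real work. Cleanup is
// deferred here so it completes before main decides on the exit code.
func run(fail bool) error {
	defer fmt.Println("releasing connections") // placeholder for real cleanup
	if fail {
		return errors.New("non-recoverable error")
	}
	return nil
}

// exitCode maps the outcome of run to the process exit code.
func exitCode(err error) int {
	if err != nil {
		return 1
	}
	return 0
}

func main() {
	err := run(len(os.Args) > 1) // any argument simulates a failure
	if err != nil {
		fmt.Fprintln(os.Stderr, "fatal:", err)
	}
	os.Exit(exitCode(err)) // safe: run's defers have already executed
}
```

Because main itself registers no defers, the os.Exit call at the end skips nothing.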

Wednesday, February 15, 2017

Password Reset Flow With Native Android

The Problem Scenario

I want a native Android application that will pull the user back after they have used their mail/SMS client to continue the password reset flow:

  1. using the native app, our user requests a password reset via email or SMS
  2. the user presses the link in their email or SMS client
  3. the user opens the native app to complete the password reset
We write native applications for a richer user experience. For all of the valid security reasons the password reset must be out of band. By pulling the user back into the native app to complete the password reset flow we can guide them to the native experience that we want to provide (and that we have spent money and time creating).


The scenario is defined by the following two Gherkin test cases.

Given the user has submitted a password reset
And the system has emailed the long url with the website's address
When the user presses on the link in their phone's mail client
Then the native android app is offered to handle the URL
And when the user chooses the native android app they are taken directly to the screen to save a new password

Given the user has submitted a password reset
And the system has sent an SMS with a bit.ly shortened URL
When the user presses on the link in their phone's SMS client
Then the native android app should be offered to handle the URL
And when the user chooses the native android app they are taken directly to the screen to save a new password

The Android Activity Registration

In Android we can provide one Activity that handles completing the password reset flow. That Activity needs the appropriate intent filters so the operating system knows it can handle the long and the short URLs:

<activity
    android:name=".authentication.CompleteResetPasswordActivity">
    <intent-filter>
        <action android:name="android.intent.action.VIEW"/>
        <category android:name="android.intent.category.DEFAULT"/>
        <category android:name="android.intent.category.BROWSABLE"/>
        <data android:scheme="https" android:host="${host}" android:pathPrefix="/passwordreset"/>
    </intent-filter>
    <intent-filter>
        <action android:name="android.intent.action.VIEW"/>
        <category android:name="android.intent.category.DEFAULT"/>
        <category android:name="android.intent.category.BROWSABLE"/>
        <data android:scheme="https" android:host="m.my.bitly.domain"/>
    </intent-filter>
</activity>

Did you see that "${host}"? It is resolved in build.gradle via:
....
buildTypes {
   debug {
      ...
      manifestPlaceholders = [host:"qa.mydomain"]
   }
   release {
      ...
      manifestPlaceholders = [host:"www.mydomain"]
   }
...

The Code

Android delivers the URL pressed by the user via the getIntent().getData() method, which returns an android.net.Uri instance. Play with it. Massage it. Turn it into whatever you want. For the shortened URL you will of course have to resolve it into the full URI. Perhaps you will use either the REST or the Android Bitly API -- https://dev.bitly.com/.

You will notice that your CompleteResetPasswordActivity is launched as the root of its Task, which has affinity to the email or messaging app. Tasks are tricky in Android. These aren't things I've had to deal with much in the past. But tasks and task affinity are things you will need to understand if you want to provide this type of user experience. But that's all I'm going to say about that here. Dealing with the task and getting the user back into the "main task" is worthy of its own post.

Quality Assurance

There are a plethora of scenarios to test. For the email flow, does the user use Gmail, Outlook on the web, some other web client, or some other native mail client? For SMS you have each carrier's custom messaging client as well as Hangouts and other options from Google. I have so far tested with:

  • my Project Fi Nexus 6. I am stuck with Hangouts. It does NOT work. Hangouts launches the link into its own internal browser.
  • a Verizon Motorola Droid. That phone has Messaging, Messaging+ (Verizon's offering), and Hangouts. The above solution works as expected in all three SMS clients. We can open Gmail and press the full URL link as well.

So.... More testing is needed. Samsung has a wide following for sure. So I'll test on a few flavors of those.

Friday, September 16, 2016

Android Font Settings To Enable Font Variants

Today I learned that fonts often have settings to enable alternate representations of particular characters. For example, Gotham is not a monospaced font. However, if you enable the "tnum" (tabular numerals) setting for your Android TextView, then the digits render with a uniform width, as they would in a monospaced font. That is cool!

It appears Android is supporting a W3C standard with this feature. The documentation has a link that references CSS Fonts. Furthermore, this method was added as part of API 21. So unfortunately your users on older API levels will not see the awesome column layout you can produce.

Android TextView Documentation

In code this would look something like this:
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
    grandTotal.setFontFeatureSettings("tnum");
}
Or perhaps you are targeting API 21 and can apply the XML setting so that any string put into the field uses the "tnum" or other settings:

<TextView
    android:id="@+id/grand_total"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:fontFeatureSettings="tnum"
    tools:text="$30.35"/>

Or perhaps you are targeting API 19 and your TextView has a style set that you can override in the values-v21 directory:

<TextView
    android:id="@+id/three"
    style="@style/example"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="$20.13" />

Then you can define a base style configuration in values/styles.xml:
<style name="example">
    <item name="android:textSize">24sp</item>
</style>
and then apply the fontFeatureSettings in values-v21/styles.xml:
<style name="example">
    <item name="android:textSize">24sp</item>
    <item name="android:fontFeatureSettings">tnum</item>
</style>
I think the screenshot below with two emulators illustrates the difference well. On the left, without "tnum", you can clearly see that the columns do not line up between the three TextView fields. On the right, with "tnum", they do.



Tuesday, August 30, 2016

Constant Code Review

Pair programming is constant code review.

Is there a column on your Kanban board for "Code Review"? Do you pair program? Why would you need that code review column? That column is counterintuitive. It is not an agile software practice. I am not proposing you never code review. Actually the opposite. You are more likely doing constant code review.

We know that bugs are easier to fix the earlier they are found. So we pair program. Fix the bug as soon as it's typed with an attentive pair. You save time by doing it right the first time. Each time you or your pair inadvertently types a bug is a learning/teaching opportunity. Use these moments to talk and internalize how that bug came through and make a mental note to not let it happen again. When you make the mistake is the best time to learn from the mistake. So make sure your pair is being attentive and reviewing the code you type.

How big are your stories? Do they take more than one pairing session? Every time you pair-switch the incoming person should be reviewing the code. The review should cover the design patterns in use as well as looking for typical pitfalls where bugs crop up. Sure, there are probably other things that happen during pair switch. Make sure you review the code that came before! Maybe your stories are not bigger than one pairing session.  It does not matter.  Hold your pair accountable to be an active copilot.

The Agile Manifesto says "Individuals and interactions over processes and tools". A code review column is putting process over your team members. We already established that you are pair programming. Why would you need to declare that code review only happens after the pair thinks the story is complete? What!? That doesn't make sense. How can the story be finished if it still needs a code review? Invest in your people. Don't let them use a code review column as a safety net that catches problems. The tightrope walker that has no net is much better at their craft than the one using a net. They have to be or it's a really short career (grin). Take away your safety net to get better at YOUR craft.

Saturday, December 27, 2014

Quick Android Ringtone

My son was making himself at home in his Android phone Christmas present yesterday. He wanted a particular guitar solo as his ringtone.  Here's how I put it together. Spoiler alert, it's much easier than this:

  1. Slice out the guitar solo using iTunes
  2. Convert the MP3 to ogg
  3. Added the ANDROID_LOOP metadata
  4. Copied the ringtone to the phone

Slice out the guitar solo using iTunes

You can configure iTunes to export songs in MP3. You can also tell iTunes to start/stop playing at certain points in a song.
  1. select the song you want to use
  2. CMD-I to open the settings dialog
  3. Go to the Options tab to enter your start/stop time. This will likely take some fiddling to get the slice you want. With these set, the song will only play this section.
  4. Now open the File menu -> Create New Version -> Create MP3 Version
  5. You probably want to go back to the CMD-I properties dialog to clear the start/stop time of this song

Convert the MP3 to ogg

A drag-n-drop later I had an ogg file out of the mp3 by using Media Human. This ogg will work as a ringtone. However there is a long pause before it loops. This is not what we wanted. 

Added the ANDROID_LOOP metadata

I found Audacity to add the loop metadata key/value pair. Drag-n-drop the ogg file into Audacity. Then File menu -> Export Audio. Choose your destination file location and press the Save button. Now you get a new dialog where you can enter the new metadata key/value pair: ANDROID_LOOP:true.

Copied the ringtone to the phone

Android File Transfer works slick. Drag-n-drop the file from a Finder window into the Ringtones directory of Android File Transfer. You don't have to disconnect the USB cable; just navigate on your phone to Settings->Sounds and pick your ringtone!

Conclusion

In the end I could have just used Audacity since my music library is already in MP3 format. I did not have to get iTunes to export an AAC into MP3. Audacity will let you select a section of a song by clicking and dragging, then further adjust the start/stop points. Simply go to the same File menu -> Export Selected Audio.