The Modern Engineer

Wednesday, January 14, 2015

Image Processing - Convolution Matrix

Introduction
Ah, image processing...a beautiful art form that has yet to be fully tapped. There are so many things that can be done with image processing: Beautifying a picture, edge detection, object recognition, etc. etc. However, those concepts can be EXTREMELY difficult to understand without some very basic foundations. That's the point of this lovely article.

The Convolution Matrix (Kernel)
In the most basic sense, a convolution is simply a multiplication of two functions across an infinite difference domain. Below is the definition of the convolution:

This can be difficult to envision with complex functions but becomes extremely useful in engineering most notably with Transfer Functions. Below are two great examples of how convolution works visually with two functions:

The convolution is used in many different applications and disciplines and can be considered (in my opinion) a fundamental mathematical operation.

So how do we use it for images? How is it even useful?

Well let's consider the images above and how the Outputs (black) are created. As the transfer function (Red) translates over the Source (Blue) you can see that the source will change as more iterations of Tau occur. Why is this occurring? Basically each point on the source is having itself multiplied by each point on the transfer function and summed up. The reason it may not look this way above is simply because t & tau are plotted on the same axis.

So now imagine the flow above is an image row and the red box is a matrix we swipe across the image performing a convolution. As the box travels across the image we are wanting to process, we're getting a result back (black). This is how the Convolution Matrix (or kernel) for an image works.

A convolution matrix is an odd matrix (matrix with odd dimensions 3x3, 5x5, 7x7, etc) that sums the intensities of surrounding pixels based on the numerical weight
value in the matrix row/column. The center value of the matrix is the primary pixel intensity weight.

The formula for a 2D convolution is shown below:

$y[m,n] = \sum_{j = -\infty}^{\infty} \sum_{i = -\infty}^{\infty} x[m,n] \cdot h[m-i, n-j]$

where x is the input image and h is the transfer function (convolution matrix). It is harder to depict how this matrix flows exactly over the image, basically the matrix is flipped in the x/y direction and multiplied and accumulated for the pixel intensity. Take a look at this example. They did a better job than I could have done visually explaining it.

So if you follow the equation above (using the link example for reference) you should easily be able to implement a basic convolution mechanism.

Kernel Filters

So now that you have the equations, how can we really use them? Well these are used in a large variety of ways. But many common filters that you see in different image processing applications use this basic concept to get the different effects. In the table below I have listed some common examples of kernels used and I have applied them to the moon image below so that you can see how it affects an image in different ways.

Original Image

Identity
Edge Detection 1
Edge Detection 2
Sharpen
Gaussian Blur (Approximate)

Conclusion
As you might can see, you can do pretty cool things with a Convolution Matrix. In general, convolution(s) are a powerful mathematical tool in any arsenal. (I may need to show how it's used in other fields).

I performed these convolution(s) using GIMP. Under the "Filter->Generic->Convolution Matrix..." menu item. Try it for yourself and see how cool it is.

If you have any questions, as always, hit me up!

Has it been 6, 12, or 18 months?

I'm sure there is some real disappointment out there from all my fans...all six of them...

Oh well. I'm trying to make a comeback. I used to love blogging and passing out COOL engineering information to those that desire knowledge. Unfortunately, I no longer am an Electrical Engineer. Instead I'm a software developer...GASP!! I know!

So although I will still do hardware posts (because you can never lose engineering, it's built into the soul), software may be more primary from here on out. So lets get started then shall we?

Tuesday, April 24, 2012

Optimization - Gradient (Steepest) Descent

Introduction

So you have an equation with 20 unknown variables. You've taken 1000 data points and need to get an equation to represent this data. Why would you need to do this?

Well it gives you a model of the data, which in turn can be used for simulations and estimations.

There are a lot of common uses for optimization. One use (in my field) is circuit optimization. When designing circuits, you sometimes need to fine tune the circuit for insertion loss or return loss. There have been times I've gotten stuck and used optimization to improve these parameters. In this article I'm going to give you a general understanding and try to provide a simple example.

General Algorithm for Gradient Descent Optimization

Gradient descent attempts to minimize a cost function by finding the gradient of the function and adjusting the variables in the opposite direction. The Gradient Descent algorithm tends to guarantee a minimum (not always, and sometimes not the global minimum).

This algorithm can be slow since you must compute a partial derivative for each unknown.

Cost Function
The first thing that needs to be created is a cost function. A cost function is function that we want to minimize.

For example, Let's say I have a circuit that is mismatched. I want to match it across a set of frequencies. Instead of doing it by hand I simply create an optimization algorithm. I want my return loss to be at least 30 dB. So the cost function would be (if Least-Squares).

$\bg_black$

where the Gamma_goal is 30 dB and Gamma_eqn is the output of our model.

The above is an example of the Least-Squares Cost Function. We're minimizing the difference between these two values. Gamma_goal is constant while Gamma_eqn is the model equation. In this case it's the Reflection coefficient.

Although the cost function can be anything, I tend to use least-squares most often. Here is a more generic form:

$S = \sum_{i=1}^{n} r_i^2$

$r_i = y_i - f(x_i,\beta)$

where beta is the parameters we're trying to find. yi and xi are the data points. In the case above, yi is the 30 dB number. I would have used there for every point.

Model Equation
The next problem is to create a model. The model must have a specific structure. It can be any equation with parameters. You can use this technique to fit lines to data as Excel does. In that case we would use an equation like so:

$\;f(x_i,\beta) = a_0 + a_1 x_i$

Using our data we would be trying to find a0 and a1. Now to find that data we must use the gradient descent algorithm.

Initial Value
Before we begin with the gradient descent, I must briefly speak about initial values. Initial values for the parameters ai must be set prior to the algorithm.

These initial values do matter (and sometimes to a large degree). This is largely because of the multiple valleys in the cost function that are possible. We desire to find the global minimum of the function but the gradient descent only locates local minima. If we're lucky it will find the global but this is not always the case.

So choose values that you believe to be close when using the gradient descent algorithm. This will increase the chances of finding the global minimum.

For instance, if we're performing line fitting, and we see the trend of the line passing through the y-axis at 20 then we should set a0 to 20.

Judgement must be used in picking initial values.

Gradient Descent Algorithm

The algorithm is based on the partial derivative of the parameters ai. Below is the equation used to determine the new values of ai:

$\;a_i^{'} = a_i - \gamma \frac{\partial S}{\partial a_i}$

The gamma is called the learning rate. The new ai' is calculated using a current ai. The learning rate should be chosen so that the jumps in the parameter is small enough not to miss the minimum and large enough so as not to take forever.

This also takes judgement but typically dynamic learning rates are used. These are learning rates that change based on the increase/decrease of the gradient.

Now the fun part...the partial derivative. In order for this algorithm to work, the partial derivative must be calculated. On a PC, an estimate works just fine for most applications. (It would defeat the purpose if we had to get an exact equation for it).

$\;\frac{\partial S}{\partial a_i} = \frac{S(a_0,a_1,...,a_i,...,a_N) - S(a_0,a_1,...,a_i + \epsilon,...,a_N)}{\epsilon}$

Epsilon is the change in the parameter to predict the limit. The closer this number to zero, the better. Of course PCs have limits but 1E-15 is usually small enough for most applications.

You must calculate all of these partial derivatives before changing the values. Never use new values of the parameters before all the derivatives have been calculated. This is a common mistake.

Once all of them are calculated, you can determine the new parameters. Then you repeat with the new values to find a closer approximation.

After so long your values will start to converge to some value. If this happens you have found a valid solution. Congrats!

There are caveats to this algorithm you must be aware of. Sometimes the learning rate is positive, sometimes negative. It's important to perform checks along the way to make sure the cost function is being reduced. If you don't do this it's possible to diverge with this algorithm as well.

Summary

The algorithm is fairly simple. Below is what must happen:

Develop a Model Equation for your data
Develop the Cost Function
Set Initial Values to the model
Determine all partial derivatives for the parameters (no parameter update)
Update parameters with the partial derivatives.
Perform any necessary checks
Repeat 4 - 6 until convergence

That should be about it. When you actually go to build this algorithm up, you will find some various things that must be done to make it efficient.

Good Luck and ask questions if you need help!

Sunday, September 11, 2011

Beyond SPICE

So here I am, working on a new version of my previous SPICE software. Of course to call something SPICE, it's usually a linear circuit simulator. When you start adding nonlinear models with S-Parameter, Harmonic Balance, and Various Large-Signal simulations you start to move past SPICE into something a little more :). So I have named this new version Beyond SPICE to show that it will be greater than a simple SPICE software.

So, in this I have started to create the basis of the software within the last 2 weeks. I'm starting (for the most part) from scratch. My last version was simply able to handle transient analysis for Resistors/Capacitors/Inductors/Current Source/Voltage Source.

This new version will handle this plus everything else.

Up to this point, I have the following elements in there first versions:

Complex Numbers and Parser:

The parser can parse directly to a complex number or can separate the equation into an expression tree. I will explain this in another post.
Can handle all variations of complex number and functions including hyperbolic functions.

Matrices:

A number of matrix classes have been created. Namely, a string matrix, complex constant matrix, and an expression tree matrix.
Complex Constant Matrix and Expression Tree Matrix can be inverted along with determinants. The simple things

Although, this may not seem like much, the value in the time spent in these 2 major class libraries has a vast impact on the speed of the final product. For instance, the Expression Tree allows for a complex statement to be parsed only once and evaluated many times (POEM - Parse Once Evaluate Many). This means that parsing doesn't need to occur directly every iteration. In fact, incorporating this into a matrix speeds up the time even more.

Secondly, having the invertability of the Expression Tree Matrix allows for inversion to only occur once and then evaluate the inverse matrix directly without having to perform the inverse at each time step.

Taking the time to develop these algorithms will show great improvement in performance down the road.

Hopefully I can keep you guys up to date more often. Any questions, feel free to email me.

---------------------
Justin Coulston
justin.coulston@gmail.com

Monday, July 4, 2011

I'm not gone

I just wanted to make sure everyone knows that I'm not down and out yet. I know it's been a while, but I plan on getting some new material into the blog. A few of the new things I'll be talking about soon will be about some design techniques that can be employed for RF Engineering. Some of these may seem simple and archaic, but can still be very useful when designing RF Chains.

Some of the topics will include:

Filter Design Techniques
Impedance Matching Techniques
Network Analysis Mathematics
Neural Networks in RF Engineering
Using software design for hardware problems
and more...

Hopefully, I can start introducing some of this material over the next year. Look for it!

Sunday, May 2, 2010

Signal Processing: Modulation/Demodulations - The Basics

I want to give a basic understanding of Modulation and Demodulation, the process in which we transform signals into different waveforms by the changing of carrier properties (amplitude, frequency, phase, etc.). I want to state that I am relatively new to modulation techniques and can only offer minimal insight into it's details. Regardless...

Modulation, as above, is a process in which signals are transformed into different waveforms by changing basic properties. For instance, say we have a bit stream we want to send over the air. We can't just pulse 5V directly to an antenna to send the information. We must first modulate the signal into a usable RF signal somehow...

There are a number of ways to do this:

Amplitude Modulation: The process of using different amplitudes to represent different values.
Frequency Modulation: The process where we change the frequency to represent different values.
Phase-Shift Modulation: The process where we change the phase of a signal to represent different values.

Courtesy National Instruments

There are many more but these are the basics. So we would have to come up with a conversion scheme for our digital stream. You would think this is the easy part (and in cases is) but the issue that arises is bandwidth.

Many times we must keep signals within certain FCC bandwidths. For example, I may have been allocated a certain bandwidth from 45.6MHz to 46.2MHz. If I transmit outside this range, I'm in trouble. So modulation will usually be based around this constraint first. After transforming the digital signal we would then modulate the signal onto a 46MHz waveform to stay within bounds. (Of course this is simplified)

Regardless, this is the basics of modulation from a systems point of view. From the circuits point of view we use mixers, PLLs, and oscillators.

Demodulation is just the inverse of this process. We will generally filter out the frequency we are looking for (so we don't get other garbage from the air) and reverse the process to retrieve the original signal.

That's it. Not much more about modulation/demodulation I can tell. It's crazy, I use it often where I work as an RF engineer, but I know so little...(I am new to the industry though).

Hope this is somewhat useful.

------------------------

Justin Coulston

justin.coulston@gmail.com

Saturday, May 1, 2010

Simulation: NPR Test Simulation

So during my time at work today (yes I worked on Saturday), I made an attempt to try and simulate an NPR test for an amplifier in Agilent's EESof/ADS. Now there is a way to accomplish this using the DSP portion along with the Analog Envelope technique but I haven't found a way to do it directly with the RF/Analog system. This is quite the disappointment...

Figure 1: NPR Test Stimulus (courtesy Agilent)

So I have been researching into solutions. Only one has presented itself up to this point: Spectral Balance Simulations. Unfortunately, ADS doesn't support this directly (at least not to my knowledge). This is different than the typical Harmonic Balance Simulation most people are used to. In fact the main difference is that the Spectral Balance simulation stays in frequency domain. The Harmonic Balance simulation instead does a transient analysis then an FFT to complete the simulation (Of course it's more complex than this).

So how do I get around it? I really have no idea right now. The fact is, this is a known issue. Why hasn't someone developed this into the more popular SPICE simulations? There are random Matlab tools that can do the trick but not directly in SPICE packages.

Anyways, that's my rant for the day. I'll go into more detail, hopefully, at a later date.

---------------------------------
Justin Coulston
justin.coulston@gmail.com

Search This Blog