2017/05/22

Testing Perl Modules on Windows: A Cry For Help

I've been working on modules again after a recent push, and I found a big project whose .travis.yml file only went up to Perl 5.20. I thought I'd earn some dev brownie points by adding the next two stable versions, and found that my build was dying in the pull-request submission process.

Specifically, it was dying on the checks. It passed Travis-CI, which runs the tests on Unix-like systems, but it was failing on Appveyor.

"What is Appveyor?", I asked myself.

Perhaps this isn't a direct quote.

Appveyor is a continuous integration service that tests against Windows systems. Scott Hanselman wrote a glowing review of it three years ago.

But there's no .appveyor.yml file in the project, so Appveyor runs its default build and fails.

I've mentioned this on the project IRC, and no, I'm not going to name names, because there is movement toward testing on Windows, and even if it doesn't work, I admire the goal.

I wrote this three years ago, in response to a Python conference video:
2) Sure, real programmers use Unix/Linux to run their code, but beginner programmers don't come in knowing how to set up an environment. They come in with the (Windows) computer they have, and the documentation sucks, and they feel lost, and they don't like it, and they don't feel the power, and they're gone. Even if you dislike Microsoft, good tools for Windows and good documentation are important for new programmers, important for building community, important for inclusiveness.
I run Windows, but I program on and for Linux, and only have one project based on Windows. But I have installed ActiveState and Strawberry Perls, and think that if you write general-purpose modules, you should be sure to test against Windows as well as Linux.

But Travis-CI has documentation covering Perl projects. Appveyor says you can use Perl 5.20. eserte wrote a post on Appveyor for blogs.perl.org last year, but I'd love to see better documentation from them, from the Perl community, or from both. Following is the YAML from eserte, with a switch to check only the master branch. But as with Travis, which uses perlbrew and allows testing as far back as 5.8.8, I think having it test against older versions of Perl, both ActivePerl and Strawberry, would be the thing.

branches:
  only:
    - master

skip_tags: true

# cache the Strawberry install between builds
cache:
  - C:\strawberry

install:
  # install Strawberry Perl (via Chocolatey) if it isn't already cached
  - if not exist "C:\strawberry" cinst strawberryperl
  # put perl, site scripts, and the bundled C toolchain on the PATH
  - set PATH=C:\strawberry\perl\bin;C:\strawberry\perl\site\bin;C:\strawberry\c\bin;%PATH%
  - cd C:\projects\%APPVEYOR_PROJECT_NAME%
  # install the distribution's dependencies
  - cpanm --installdeps .

build_script:
  - perl Makefile.PL
  - dmake test

If you have a more fully-featured .appveyor.yml file you'd like the Perl community to use, especially distinguishing MakeMaker modules from Dist::Zilla modules, I'd love to see it.

2017/05/17

Contact Me! (If you REALLY need to)

A recent comment from Shlomi Fish said:
Hi! I cannot seem to find any contact information on this page. How should I contact you?
He then linked to a FAQ entry explaining his position on the state of email, comparing the futility of hiding addresses with the benefits of being open.

I have to say, I hadn't thought about this in ... years? In general, I'm active on Twitter (@jacobydave), which is good if you're on Twitter, but not helpful if you aren't. I try to keep track of the comments, but that doesn't fit every message a person would want to send me.

So, a friendly "Hey, you should put your email on your blog" comment makes sense to me.

But adding more traffic to the mailbox that friends and relatives have access to doesn't. I'm happy to put up an email address, but I'm less than happy to make it my main email address. It comes down to context; my coworkers generally don't get that one either.

I had a long-running project, barely touched by me but used enough by others, that provided R syntax highlighting in Komodo Edit. It's dead now, because that highlighting is native, but the ActiveState packaging used an email address to set the ID, so I created rlangsyntax@gmail.com.

So, to the right, in a section called "More Of Me", there is the requested mailto: link, pointing to rlangsyntax@gmail.com. I will check it. Use it in good health.

2017/05/08

Coffee and Code and Calendars and R

I have what you might call a conflicted relationship with caffeine.



Long story short: I found it necessary to cut down on caffeine, limiting myself to two cups a day, preferably before noon. As a measure of accountability, and as a way to gain more skill with SQL and R, I wrote tools that store when I drink coffee and, at the end of the work day, tweet the image.


I forget exactly where I found the calendar heatmap code, but it was several years ago, and was one of the first instances of ggplot2 that I found and put into service. I chose a brown-to-black color scheme because, well, obviously it needed to look like coffee.

This image, with "Dave has had n cups of coffee today", is autotweeted every day at 6pm Eastern. Recently, it has drawn interest.


So here it is, both blogged and gisted.


I started doing that thing with YAML in Perl, to keep database keys out of programs. I'm reasonably okay with my SQL skills, I think, but I am clear that my R code is largely cargo-cult. It'd be good to replace 2, 4, 6 with the names of the days of the week, and I am reasonably sure the axis is reversed from the way you'd normally expect weekdays to go. Some day, I'll have the knowledge required to make those changes.
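That YAML thing is simple enough to show. Here's a minimal sketch of the idea, with a hypothetical ~/.coffee.yml holding the credentials; the file name and key names are invented for the example:

#!/usr/bin/env perl
use strict ;
use warnings ;

use DBI ;
use YAML qw{ LoadFile } ;

# credentials live in a YAML file in $HOME, not in the program.
# (~/.coffee.yml and these key names are made up for illustration.)
my $config = LoadFile("$ENV{HOME}/.coffee.yml") ;

my $dbh = DBI->connect(
    "dbi:mysql:database=$config->{database};host=$config->{host}",
    $config->{user},
    $config->{password},
    ) ;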

If you have questions and comments about this, I'd be glad to take them on, but I'm very much the learner when it comes to R.

count is not uniq...

I recall reading something online about 20 years ago (gasp!) where the authors were looking for a core set of knowledge that would constitute "knowing Unix", and found that there just wasn't one. Knowing Unix was like the Humpty Dance, in that no two people do it the same.

And, presumably, you have Unix down when you appear to be in pain.


I have been a Unix/Linux user since the 1990s, and I only found out about uniq -c because of the last post. I had been using sort | uniq forever, and have since switched to sort -u, which I also learned about only recently.

I find that uniq is kinda useless without a sort in front of it; if your input is "foo foo foo bar foo" (with requisite newlines, of course), uniq without sort will give you "foo bar foo" instead of "foo bar" or "bar foo", either of which is closer to what I want.

So, I could see adding alias count="sort | uniq -c" to my bash setup, but adding a count program to my ~/bin seems much better to me, much closer to the Right Thing.

marc chantreux suggested an implementation of count that is perhaps better and certainly shorter than the one I posted. It had regex magic that I simply didn't need, because I wanted the count to stand by itself. (But I might revisit to remove the awk step, because, as a user, I'm still a bit awkward with awk.)

#!/usr/bin/env perl
use strict ;
use warnings ;
use feature qw{say} ;

my %seen ;

# count each item, taking the items from @ARGV if there are
# any, and from standard input (chomped) otherwise
map { $seen{$_}++ } do {
    @ARGV ? @ARGV : map { chomp ; $_ } <>;
    } ;

# print "count<TAB>item" for each distinct item
while ( my ( $k, $v ) = each %seen ) {
    say join "\t", $v, $k ;
    }
I like marc's use of the ternary operator to handle STDIN vs @ARGV, but I'm somewhat inconsistently against map where a for would do. I know people who think that a map whose result doesn't go into an array is a problem, so I don't go back to it often.

I do, however, do for my $k ( keys %seen ) { ... } enough that I'm sort of mad at myself for not encountering each before.

ETA: It's been brought to my attention that using map {} as a replacement for for () {} is not good.
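So, for the record, here is the counting step again with a for loop instead of the void-context map; same behavior, just the shape people expect:

my %seen ;

# read the items from @ARGV or STDIN, then count with a for loop
my @items = @ARGV ? @ARGV : map { chomp ; $_ } <> ;
for my $item ( @items ) {
    $seen{$item}++ ;
    }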


2017/05/05

One! One New Utility! Bwa-ha-hahaha!

Classic Unix gives you a number of great utilities, and you can use sed and awk and bash when those aren't enough.

But sometimes ...

I use ~ as a scratch space all too often, which leaves me with a huge number of files that I stopped playing with a while ago. I can get to the point of knowing what types of files they are, sure, as I show here.

$ ls *.* | awk -F. '{ print $NF }' 
jpg
jpg
jpg
jpg
txt
txt
txt
pl
txt
txt
pl
txt
pl
pl
txt
html
pl
pl
gz
mp3
pl
pl
pl
pl
txt
pl
pl
sh
sh
txt
pl
pl
diff
txt
txt
txt
pl
txt
pl
txt
txt
txt
txt
py
...

But this only gets you so far. I can sort and know that there are a LOT of Perl files, perhaps too many, but nothing immediately tells me how many.

But hey, I am a programmer, so I wrote a solution.

And here it is in a shell pipeline, combined with sort to order the counts, which shows a lot of throwaway Perl programs.

$ ls *.* | awk -F. '{ print $NF }' | count | sort -nr

95 pl
59 txt
10 sh
10 jpg
8 py
6 html
6 csv
5 js
2 gz
2 diff
1 zip
1 ttf
1 tt
1 svg
1 sql
1 Rd
1 R
1 pub
1 png
1 pdf
1 mp4
1 mp3
1 log
1 json
1 foo
1 conf
1 cnf
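
Since I admitted I'm still awkward with awk: the extension-grabbing step is also a one-liner in Perl, where -F'\.' sets the split pattern and $F[-1] is the last field. Same pipeline, no awk:

$ ls *.* | perl -F'\.' -lane 'print $F[-1]' | count | sort -nr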


I suppose I need to do some cleanup in $HOME today...

2017/03/14

Coding for Pi Day

Today is Pi Day, which is a good day to talk about Pi.

Normally, I'd probably use Pi, sine and cosine to draw things, but instead, I flashed on a couple of ways to estimate Pi.

Also, it shows that you can use Unicode characters in Perl.

#!/usr/bin/env perl

use feature qw{ say } ;
use strict ;
use warnings ;
use utf8 ;

my $π = 3.14159 ;

my $est2  = estimate_2() ;
my $diff2 = sprintf '%.5f',abs $π - $est2 ;
say qq{Estimate 2: $est2 - off by $diff2} ;

my $est1  = estimate_1() ;
my $diff1 = sprintf '%.5f',abs $π - $est1 ;
say qq{Estimate 1: $est1 - off by $diff1} ;

exit ;

# concept here is that the area of a circle = π * r**2.
# if r == 1, area = π. If we just take the part of the circle
# where x and y are positive, that'll be π/4. So, take a random
# point between (0,0) and (1,1) and see if the distance between
# it and (0,0) is < 1. If so, we increment, and 4 * the count /
# the number so far is an estimate of π.

# because randomness, this will change each time you run it

sub estimate_1 {
    srand ;
    my $inside = 0.0 ;
    my $pi ;
    for my $i ( 1 .. 1_000_000 ) {
        my $x = rand ;
        my $y = rand ;
        $inside++ if $x * $x + $y * $y < 1.0 ;
        $pi = sprintf '%.5f', 4 * $inside / $i ;
        }
    return $pi ;
    }

# concept here is that π can be estimated by 4 ( 1 - 1/3 + 1/5 - 1/7 ...)
# so we get closer the further we go
sub estimate_2 {
    my $pi = 0;
    my $c  = 0;
    for my $i ( 0 .. 1_000_000 ) {
        my $j = 2 * $i + 1 ;
        if ( $i % 2 == 1 ) { $c -= 1 / $j ; }
        else               { $c += 1 / $j ; }
        $pi = sprintf '%.5f', 4 * $c ;
        }
    return $pi ;
    }
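
Run it and the output looks something like this. The first line is effectively fixed, because the series is deterministic, while the second wanders from run to run (these numbers are illustrative):

Estimate 2: 3.14159 - off by 0.00000
Estimate 1: 3.14072 - off by 0.00087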

2017/02/28

Having Problems Munging Data in R

#!/group/bioinfo/apps/apps/R-3.1.2/bin/Rscript

# a blog post in code-and-comment form

# Between having some problems with our VMs and wanting 
# to learn Log::Log4perl. I wrote a program that took 
# the load average -- at first at the hour, via 
# crontab -- and stored the value. And, if the load 
# average was > 20, it would send me an alert

# It used to be a problem. It is no longer. Now I 
# just want to learn how to munge data in R

# read in file
logfile = read.table('~/.uptime.log')

# The logfile looks like this:
#
#   2017/01/01 00:02:01 genomics-test : 0.36 0.09 0.03
#   2017/01/01 00:02:02 genomics : 0.04 0.03 0.04
#   2017/01/01 00:02:02 genomics-db : 0.12 0.05 0.01
#   2017/01/01 00:02:04 genomics-apps : 1.87 1.24 0.79
#   2017/01/01 01:02:02 genomics-db : 0.24 0.14 0.05
#   2017/01/01 01:02:02 genomics-test : 0.53 0.14 0.04
#   2017/01/01 01:02:03 genomics : 0.13 0.09 0.08
#   2017/01/01 01:02:04 genomics-apps : 1.66 1.82 1.58
#   2017/01/01 02:02:01 genomics-test : 0.15 0.03 0.01
#   ...

# set column names
colnames(logfile)=c('date','time','host','colon','load','x','y')

# now:
#
#   date       time     host         colon load x y
#   2017/01/01 00:02:01 genomics-test : 0.36 0.09 0.03
#   2017/01/01 00:02:02 genomics : 0.04 0.03 0.04

logfile$datetime <- paste( as.character(logfile$date) , as.character(logfile$time) )
# datetime == 'YYYY/MM/DD HH:MM:SS'
logfile$datetime <- sub('......$','',logfile$datetime)
# datetime == 'YYYY/MM/DD HH'
logfile$datetime <- sub('/','',logfile$datetime)
# datetime == 'YYYYMM/DD HH'
logfile$datetime <- sub('/','',logfile$datetime)
# datetime == 'YYYYMMDD HH'
logfile$datetime <- sub(' ','',logfile$datetime)
# datetime == 'YYYYMMDDHH'

# datetime is now 'YYYYMMDDHH' for every row in logfile. I love clean data

# removes several columns we no longer need

logfile$time    <- NULL
logfile$date    <- NULL
logfile$colon   <- NULL
logfile$x       <- NULL
logfile$y       <- NULL

# logfile now looks like this:
#
#   datetime  host             load
#   2017010100 genomics-test    0.36 
#   2017010100 genomics         0.04 
#   2017010100 genomics-db      0.12 
#   2017010100 genomics-apps    1.87 
#   2017010101 genomics-db      0.24 
#   2017010101 genomics-test    0.53 
#   2017010101 genomics         0.13 
#   2017010101 genomics-apps    1.66 
#   2017010102 genomics-test    0.15 
#   ...

# and we can get the X and Y for a big huge replacement table
hosts <- unique(logfile$host[order(logfile$host)])
dates <- unique(logfile$datetime)

# because what we want is something closer to this
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010100      0.04        1.87            0.12            0.36
#   2017010101      0.13        1.66            0.15            0.53
#   ...

# let's try to put it into a dataframe

uptime.data <- data.frame()
uptime.data$datetime <- vector() ;
for ( h in hosts ) {
    uptime.data[h] <- vector()
    } 

# and here, we have a data frame that looks like 
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#
# as I understand it, you can only append to a data frame by merging.
# I need to create a data frame that looks like
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010100      0.04        1.87            0.12            0.36
#
# and then merge that. Then do the same with 
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010101      0.13        1.66            0.15            0.53
#
# and so on.
#
# I don't know how to do that. 
#
# I *think* the way is to make a one-vector data frame:
#
#   datetime        
#   2017010101      
#
# and add the vectors one at a time.

for ( d in dates ) {

    # we don't want the whole log here. we just want
    # this hour's data
    # 
    #   datetime  host             load
    #   2017010100 genomics-test    0.36 
    #   2017010100 genomics         0.04 
    #   2017010100 genomics-db      0.12 
    #   2017010100 genomics-apps    1.87 
    log <- subset(logfile, datetime==d)

    print(d)

    for ( h in hosts ) {
        # and we can narrow it down further
        # 
        #   datetime  host             load
        #   2017010100 genomics         0.04 
        hostv <- subset(log,host==h)
        load = hostv$load 
        # problem is, due to fun LDAP issues, sometimes 
        # the logging doesn't happen
        if ( 0 == length(load) ) { load <- -1 }
        print(paste(h, load ))
    }

    # and here's where I'm hung. I can get all the pieces 
    # I want, even -1 for missing values, but I can't seem  
    # to put it together into a one-row data frame
    # to append to uptime.data. 

    #   [1] "2017010100"
    #   [1] "genomics 0.04"
    #   [1] "genomics-apps 1.87"
    #   [1] "genomics-db 0.12"
    #   [1] "genomics-test 0.36"
    #   [1] "2017010101"
    #   [1] "genomics 0.13"
    #   [1] "genomics-apps 1.66"
    #   [1] "genomics-db 0.24"
    #   [1] "genomics-test 0.53"
    #   [1] "2017010102"
    #   [1] "genomics 0.36"
    #   [1] "genomics-apps 0.71"
    #   [1] "genomics-db 0.08"
    #   [1] "genomics-test 0.15"

}
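
# For completeness: the logger that feeds ~/.uptime.log is Perl,
# not R, and isn't shown in this post. A minimal sketch of the
# idea (without the Log::Log4perl and alerting parts) might look
# like this:
#
#   #!/usr/bin/env perl
#   use strict ;
#   use warnings ;
#
#   use POSIX qw{ strftime } ;
#   use Sys::Hostname ;
#
#   # pull the three load averages out of uptime's output
#   my ($load) = `uptime` =~ m{load averages?:\s+(.+)$} ;
#   die "no load average?\n" unless defined $load ;
#   $load =~ s/,//g ;    # "0.36, 0.09, 0.03" -> "0.36 0.09 0.03"
#
#   my $now  = strftime '%Y/%m/%d %H:%M:%S', localtime ;
#   my $host = hostname() ;
#
#   open my $fh, '>>', "$ENV{HOME}/.uptime.log" or die $! ;
#   print $fh join( ' ', $now, $host, ':', $load ), "\n" ;
#   close $fh ;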

2017/01/20

Ding! Ding! The Process is Dead!

Starts with a thing I saw on David Walsh's Blog:
I've been working with beefy virtual machines, docker containers, and build processes lately. Believe it or not, working on projects aimed at making Mozilla developers more productive can mean executing code that can take anywhere from a minute to an hour, which in itself can hit how productive I can be. For the longer tasks, I often get away from my desk, make a cup of coffee, and check in to see how the rest of the Walsh clan is doing.

When I walk away, however, it would be nice to know when the task is done, so I can jet back to my desk and get back to work. My awesome Mozilla colleague Byron "glob" Jones recently showed me his script for task completion notification and I forced him to put it up on GitHub so you all can get it too; it's called ding!
OK, that sounds cool. So I go to Github and I see one line that gives me pause.

Requires ding.mp3 and error.mp3 in same directory as script. OSX only.

I can handle the mp3s, but I don't own or run an OSX computer. (I have one somewhere, but it's ancient and has no functioning battery. I don't use it.)

"So," I think, "how could I do this on my Linux box? What's the shortest path toward functionality on this concept?"

Well, recently, I have been playing with Text-to-Speech. Actually, I have been a long-time user of TTS, using festival then espeak to tell me the current time and temperature on the hour and half-hour. I switched to Amazon's Polly in December, deciding that the service sounded much better than the on-my-computer choices. (Hear for yourself.) So, I knew how to handle the audio aspects.

The other part required me to get much more familiar with Perl's system function than I had been previously.


I'm not yet 100% happy with this code, but I'm reasonably okay with it so far. Certainly the concept has been proven. (I use the audio files from globau's ding.) With enough interest, I will switch it from being a GitHub gist to being a repo.
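Since the gist may move around, here's a minimal sketch of the shape of the thing, not the actual gist: run the command, then play one of two sounds depending on the exit status. It assumes mpg123 (or any command-line player) is installed and the mp3s sit next to the script:

#!/usr/bin/env perl
use strict ;
use warnings ;

use File::Basename qw{ dirname } ;

# run whatever command we were given
my @command = @ARGV or die "usage: ding command [args]\n" ;
my $status  = system @command ;

# ding.mp3 on success, error.mp3 on failure, both next to
# this script. mpg123 is an assumption; swap in your player.
my $sound = $status == 0 ? 'ding.mp3' : 'error.mp3' ;
system 'mpg123', '-q', join '/', dirname($0), $sound ;

exit( $status >> 8 ) ;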

2016/11/19

Graphs are not that Scary!



As with most things I blog about, this starts with Twitter. I follow a lot of people on Twitter, and I use Lists. I want to be able to group people more-or-less on community, because there's the community where they talk about programming, for example, and the community where they talk about music, or the town I live in.

I can begin to break things up myself, but curation is a hard thing, so I wanted to do it automatically. And I spent a long time not knowing what to do. I imagined myself traversing trees in what looks like linked lists reimagined by Cthulhu, and that doesn't sound like much fun at all.

Eventually, I decided to search on "graphs and Perl". Of course, I probably should've done it earlier, but oh well. I found Graph. I had used GD::Graph before, which is a plotting library. (There has to be some index of how overloaded words are.) And once I installed it, I figured it out: As a programmer, all you're dealing with are arrays and hashes. Nothing scary.

Word Ladder


We'll take a problem invented by Lewis Carroll called a "word ladder", where you find your way from one word (for example, "cold") to another ("warm") by changing one letter at a time:

    cold
    coRd
    cArd
    Ward
    warM

Clearly, this can be and often is done by hand, but if you're looking to automate it, there are three basic problems: what words are available, how do you determine when two words are one change apart, and how do you find the provably shortest path?

First, I went to CERIAS years ago and downloaded word lists. Computer security researchers use them because real words are bad passwords, so lists of real words can be used to create rainbow tables and the like. My lists are years old, so there may be new words I don't account for, but unlike Lewis Carroll, I can get from APE to MAN in five words, not six.

    ape
    apS
    aAs
    Mas
    maN

Not sure that Lewis Carroll would've accepted AAS, but there you go.

There is a term for the number of changes it takes to go from one word to another: the Levenshtein distance. I first learned about it from perlbrew, which is how, if you type "perlbrew isntall", it guesses that you meant to type "perlbrew install". It's hardcoded there because perlbrew can't assume you have anything but perl and core modules. I use the function from perlbrew instead of Text::Levenshtein, but that is a module worth looking into.
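If you'd rather use the module, Text::Levenshtein's distance function does the same job. A quick example:

use feature qw{say} ;
use Text::Levenshtein qw{distance} ;

say distance( 'isntall', 'install' ) ;    # 2
say distance( 'cold', 'cord' ) ;          # 1, one rung of the ladder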

And the final answer is "Put it into a graph and use Dijkstra's Algorithm!"

Perhaps not with the exclamation point.

Showing Code


Here's making a graph of it:

#!/usr/bin/env perl

use feature qw{say} ;
use strict ;
use warnings ;

use Data::Dumper ;
use Graph ;
use List::Util qw{min} ;
use Storable ;

for my $l ( 3 .. 16 ) {
    create_word_graph($l) ;
    }
exit ;

# -------------------------------------------------------------------
# we're creating a word graph of all words that are of length $length
# where the nodes are all words and the edges are unweighted, because
# they're all weighted 1. No connection between "foo" and "bar" because 
# the distance is "3".

sub create_word_graph {
    my $length = shift ;
    my %dict = get_words($length) ;
    my @dict = sort keys %dict ; # sorting probably is unnecessary
    my $g    = Graph->new() ;

    # compare each word to each word. If the distance is 1, put it
    # into the graph. This implementation is O(N**2) but probably
    # could be redone as O(NlogN), but I didn't care to.

    for my $i ( @dict ) {
        for my $j ( @dict ) {
            my $dist = editdist( $i, $j ) ;
            if ( $dist == 1 ) {
                $g->add_edge( $i, $j ) ;
                }
            }
        }

    # Because I'm using Storable to store the Graph object for use
    # later, I only use this once. But, I found there's an endian
    # issue if you try to open Linux-generated Storable files in
    # Strawberry Perl. (Storable's nstore, which writes in network
    # byte order, is the portable alternative.)

    store $g , "/home/jacoby/.word_$length.store" ;
    }

# -------------------------------------------------------------------
# this is where we get the words and only get words of the correct
# length. I have a number of dictionary files, and I put them in
# a hash to de-duplicate them.

sub get_words {
    my $length = shift ;
    my %output ;
    for my $d ( glob( '/home/jacoby/bin/Toys/Dict/*' ) ) {
        if ( open my $fh, '<', $d ) {
            for my $l ( <$fh> ) {
                chomp $l ;
                $l =~ s/\s//g ;
                next if length $l != $length ;
                next if $l =~ /\W/ ;
                next if $l =~ /\d/ ;
                $output{ uc $l }++ ;
                }
            }
        }
    return %output ;
    }

# -------------------------------------------------------------------
# straight copy of Wikipedia's "Levenshtein Distance", straight taken
# from perlbrew. If I didn't have this, I'd probably use 
# Text::Levenshtein.

sub editdist {
    my ( $f, $g ) = @_ ;
    my @a = split //, $f ;
    my @b = split //, $g ;

    # There is an extra row and column in the matrix. This is the
    # distance from the empty string to a substring of the target.
    my @d ;
    $d[ $_ ][ 0 ] = $_ for ( 0 .. @a ) ;
    $d[ 0 ][ $_ ] = $_ for ( 0 .. @b ) ;

    for my $i ( 1 .. @a ) {
        for my $j ( 1 .. @b ) {
            $d[ $i ][ $j ] = (
                  $a[ $i - 1 ] eq $b[ $j - 1 ]
                ? $d[ $i - 1 ][ $j - 1 ]
                : 1 + min( $d[ $i - 1 ][ $j ], $d[ $i ][ $j - 1 ], $d[ $i - 1 ][ $j - 1 ] )
                ) ;
            }
        }

    return $d[ @a ][ @b ] ;
    }

Following is what my word lists give me at each length. Something tells me that, by the time we get to 16-letter words, it's more a bunch of disconnected nodes than a graph.

1718 3-letter words
6404 4-letter words
13409 5-letter words
20490 6-letter words
24483 7-letter words
24295 8-letter words
19594 9-letter words
13781 10-letter words
8792 11-letter words
5622 12-letter words
3349 13-letter words
1851 14-letter words
999 15-letter words
514 16-letter words

My solver isn't perfect, and the first thing I'd want to add is ensuring that both the starting and ending words are actually in the word list. Without that, your code goes on forever.
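That check is cheap, because Graph already knows its vertices. Something like this sketch, dropped in before the search starts, would do:

# refuse to search unless both endpoints are actual words
for my $word ( $source, $target ) {
    die qq{"$word" is not in the word list\n}
        unless $graph->has_vertex($word) ;
    }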

So, I won't show off the whole program below, but it does use Storable, Graph and feature qw{say}.

dijkstra( $graph , 'foo' , 'bar' ) ;

# -------------------------------------------------------------------
# context-specific perl implementation of Dijkstra's Algorithm for
# shortest-path

sub dijkstra {
    my ( $graph, $source, $target, ) = @_ ;

    # the graph pre-exists and is passed in 
    # $source is 'foo', the word we're starting from
    # $target is 'bar', the word we're trying to get to

    my @q ;    # will be the list of all words
    my %dist ; # distance from source. $dist{$source} will be zero
    my %prev ; # the previous word on the path back to $source,
               # for every word we reach as we pull edges from
               # the graph.

    # we set the distance for every node to basically infinite, then
    # set the starting point's distance to zero

    for my $v ( $graph->unique_vertices ) {
        $dist{$v} = 1_000_000_000 ;    # per Wikipedia, infinity
        push @q, $v ;
        }
    $dist{$source} = 0 ;

LOOP: while (@q) {

        # resort, putting words with short distances first
        # first pass being $source , LONG WAY AWAY

        @q = sort { $dist{$a} <=> $dist{$b} } @q ;
        my $u = shift @q ;

        # say STDERR join "\t", $u, $dist{$u} ;

        # here, we end the first time we see the target.
        # we COULD get a list of every path that's the shortest length,
        # but that's not what we're doing here

        last LOOP if $u eq $target ;

        # this is a complex and unreadable way of ensuring that
        # we're only getting edges that contain $u, which is the 
        # word we're working on right now

        for my $e (
            grep {
                my @a = @$_ ;
                grep {/^${u}$/} @a
            } $graph->unique_edges
            ) {

            # $v is the word on the other end of the edge
            # $w is the distance, which is 1 because of the problem
            # $alt is the new distance between $source and $v, 
            # replacing the absurdly high number set before

            my ($v) = grep { $_ ne $u } @$e ;
            my $w   = 1 ;
            my $alt = $dist{$u} + $w ;
            if ( $alt < $dist{$v} ) {
                $dist{$v} = $alt ;
                $prev{$v} = $u ;
                }
            }
        }

    my @nodes = $graph->unique_vertices ;
    my @edges = $graph->unique_edges ;
    return {
        distances => \%dist,
        previous  => \%prev,
        nodes     => \@nodes,
        edges     => \@edges,
        } ;
    }

I return lots of stuff, but the part that's really necessary is %prev, because that, plus $source and $target, is everything you need to recover the path. Several words are one step from any given word, but $prev holds, for each word reached, the word we came from on a shortest path. In the expanded case of FOO to BAR, $prev->{BAR} is 'FAR', $prev->{FAR} is 'FOR', and $prev->{FOR} is 'FOO'.
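Walking that back out is a short loop. Here's a sketch (the sub name is mine, but the names otherwise match the code above):

# walk %prev back from the target to the source, building
# the ladder in order
sub reconstruct_path {
    my ( $prev, $source, $target ) = @_ ;
    my @path = ($target) ;
    while ( $path[0] ne $source ) {
        my $word = $prev->{ $path[0] } ;
        die qq{no path to "$target"\n} unless defined $word ;
        unshift @path, $word ;
        }
    return @path ;
    }

With the FOO-to-BAR case, and the returned hashref in $output, reconstruct_path( $output->{previous}, 'FOO', 'BAR' ) gives FOO, FOR, FAR, BAR.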

And nothing in there is complex. It's all really hashes or arrays or values. Nothing a programmer should have any problem with.

CPAN has a number of other modules of use: Graph::Dijkstra has that algorithm already written, and Graph::D3 allows you to create a graph in such a way that you can use it in D3.js. Plus, there are a number of modules in Algorithm::* that do good and useful things. So go in, start playing with it. It's deep, there are weeds, but it isn't scary.

2016/11/08

Modern Perl but not Modern::Perl

This started while driving to work. If I get mail from coworkers, I get Pushover notifications, and halfway from home, I got a bunch of notifications.

We don't know the cause of the issue, but I do know the result.

We have env set up on our web server so that perl is /our/own/private/bin/perl and not /usr/bin/perl. This runs in a highly-networked and highly-clustered environment, mostly RedHat 6.8 with 5.10.1 as the system perl, so if we want a consistent version and consistent modules, we need our own. This allows us to have #!/usr/bin/env perl as our shebang.

And this morning, for reasons I don't know, it stopped working. Whatever perl was being called, it wasn't /our/own/private/bin/perl. And this broke things.

One of the things that broke is this: whatever perl /usr/bin/env perl was now finding, it doesn't have Modern::Perl.

I'm for Modern Perl. My personal take is that chromatic and Modern Perl kept Perl alive within Perl 5 while Larry Wall and the other language developers worked on Perl 6. Thus, I am grateful that it exists. But, while I was playing with it, I found a problem: Modern::Perl is not in Core, so you cannot rely on it being there. A script might be running on a perl newer than 5.8.8 that can give you everything you need, which to me is normally use strict, use warnings and use feature qw{say}; but if you ask for Modern::Perl to get those, it fails, and because you don't know which Modern thing you wanted, you don't know how to fix it.
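Which is why I've gone back to asking for the pieces by name. Everything in this block is core, so it works on any perl from 5.10 on:

#!/usr/bin/env perl

# all core, no CPAN required; feature 'say' needs 5.10+
use strict ;
use warnings ;
use feature qw{say} ;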

This is part of my persistent hatred of magic. If it works and you don't understand how, you can't fix it when it stops working. I got to the heavy-magic parts of Ruby and Rails, and that, as well as "Life Happens", is why I stopped playing with it. And, I think, this is a contributing factor in this morning's brokenness.