Ben Wellington, who runs the excellent I Quant New York, did a nice job here unpacking NYC taxi charge data. He found that two different systems installed in taxis calculate driver tips differently.

He estimates that the more generous (?) system produces $5.2 million in tips above what the other system provides. Of course, riders have been unaware, and Wellington’s analysis suggests that the Taxi Commission and drivers were probably unaware as well. This is why we should “use the damn data.” And it’s why public data should be open to the public. Even the best-intentioned and most capable public officials do not have the time or resources to explore it all. Making data public enables people like Wellington to do their own explorations.

So, even when it’s in a good visualization, it’s not just a matter of looking at the data. It’s also a matter of thinking about it. It’s also a matter of following its logic and asking questions about what each type of data really represents and how different fields relate to one another (and often to what’s missing).

Wellington makes some cogent suggestions. Here are the summary points, but you ought to take a look at his post:

  • The TLC (Taxi and Limousine Commission) should fix the in-cab payment systems to make them consistent with one another.
  • The TLC should release taxi data directly to the public, not through FOIL.
  • When doing data science, look at the raw data.
  • Link to or publish your data sources.

All excellent suggestions. Kudos to Ben Wellington. 

And, by the way, check this post of Wellington’s as well. If you’ve ever had a NYC MTA Metrocard, the value of which you did not fully exhaust, this is an excellent example of how Open Data can lead to very concrete suggestions that save the public a lot of money. The value of this one is even greater than from tips.

{ 0 comments }

Andrew Cuomo, Governor of New York, in his combined State of the State and Budget Address, January 21, 2015:

“When people complain about high taxes in New York, they’re talking about the property tax.”

OK, how much, for what and where? Mostly schools and secondarily counties. These data are for over 3,200 local governments outside of New York City. They include:

  • School districts
  • Counties
  • Cities
  • Towns
  • Villages
  • Fire Districts

The data are from annual financial reports submitted by each local government to the Office of the State Comptroller. In some cases, local governments have not submitted their reports or have not submitted them on time. Not all local governments use the same fiscal year. For simplicity, the data are in all cases displayed by the calendar year in which the fiscal year ends.

Figure: Property Tax by Region Outside NYC, by Type of Municipality, 2013 (Public Signals LLC)

Here’s a table with the summary numbers:

Table: Property Taxes, NYS Outside NYC, FY 2013, by Region and Type (Public Signals LLC)

And the trends? Again, schools. Though it’s hard to see in the detail, in the aggregate, village property taxes now exceed those levied by cities.

Figure: Local Government Property Taxes, NYS Outside NYC, by Type of Government, 1998 to 2013

As is so often the case, there’s lots of variability. In this graphic, each dot represents a local governmental unit, color coded by type, and organized by region.

Figure: Scatter of Property Taxes per $1,000 Value by Region, Color Coded by Municipal Type, 2013 (Public Signals LLC)

Hope you find this useful.

{ 0 comments }

Continuing the discussion regarding the City of Albany, one of the participants was concerned about debt and the potential for taking on excess amounts. That’s always a legitimate concern.

So I ran some comparisons. The first, for selected cities in New York, uses a really good indicator: debt service (principal and interest) as a percentage of revenue. Along with the actual rates, you’ll see the trend and average for each entity individually. Additionally, the grey band that runs across the entire graphic represents 80 to 120 percent of the average of all entities shown.
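For readers who want to reproduce the indicator, here is a minimal sketch in Python. It is not the code behind the graphic. The city names and dollar figures are made up for illustration, and it assumes the "average of all entities" is the mean of each entity's own average, which may differ from how the band in the graphic was computed.

```python
from statistics import mean


def debt_service_ratio(principal, interest, revenue):
    """Debt service (principal + interest) as a percentage of revenue."""
    return 100.0 * (principal + interest) / revenue


def comparison_band(entity_averages, low=0.8, high=1.2):
    """Return (lower, upper) bounds at 80 and 120 percent of the overall average.

    Assumption: the overall average is the mean of the per-entity averages.
    """
    overall = mean(entity_averages.values())
    return low * overall, high * overall


def within_band(value, band):
    """True when a ratio falls inside the comparison band."""
    lower, upper = band
    return lower <= value <= upper


# Hypothetical figures (principal, interest, revenue), one tuple per year.
sample = {
    "City A": [(8.0, 2.0, 100.0), (9.0, 3.0, 100.0)],
    "City B": [(4.0, 1.0, 100.0), (6.0, 1.0, 100.0)],
}

# Average ratio for each entity across its years.
averages = {
    city: mean(debt_service_ratio(*row) for row in rows)
    for city, rows in sample.items()
}
band = comparison_band(averages)

for city, avg in averages.items():
    print(city, round(avg, 1), "inside band" if within_band(avg, band) else "outside band")
```

Averaging per entity first keeps a city with more reported years from dominating the band, which seems a reasonable choice when some years are missing (as with Ithaca in 2013).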

For purposes of comparability, these cities have populations between 25,000 and 199,999. Each column has the trend data for a city. You’ll see the actual data graphed from 1998 to 2013. In this graphic, there’s no data for Ithaca for 2013.

Figure: Upstate NYS Cities Debt Service Trends

Please pardon the small print. Here’s the same graphic as a PDF: Public Signals LLC Selected NYS Cities Debt as Pct of Rev 1.pdf

Some quick observations:

  • Albany and Troy look average and relatively stable. The average for Albany was 9.9 percent and for Troy, it was 8.4.
  • It looks like the City of Niagara Falls either paid off some debt or got a big and recurring chunk of revenue. I’d bet on the former.
  • The trend of greatest concern would be for the City of Binghamton. Though in the most recent year shown, 2013, the numbers came down from their particular peak, the overall direction may be a concern.
  • The trend for Saratoga Springs is also upward sloping, but it started from a relatively low base so even in 2013, it’s lower than most.
  • Interestingly, there appear to be more cities whose numbers are trending downward than upward. Auburn, Jamestown and Rome declined and then stabilized, but North Tonawanda, Schenectady, Syracuse and Watertown all showed a pretty steady decline.

Here’s a simpler view for 2013:

Figure: Debt Per Capita, Selected NYS Cities, 2013 (Public Signals LLC)

None of this is to suggest that there is a “right” rate. Moreover, it’s important to recognize that rates may be higher in places that have invested more in upgrading and maintaining infrastructure and capital assets. If well targeted, that investment can benefit the community and improve its economic prospects. That’s a more sophisticated analysis than what’s offered here. But making these sorts of comparisons among similar jurisdictions and over time is essential to awareness and insight.

Hope you find it helpful.

 

{ 0 comments }

Triggered by some upstate mayors, there’s an interesting debate emerging locally in the City of Albany. And it’s pertinent to many more cities than Albany.

It started at a SCAA Forum on inequality (which was quite good, and to which I’ll return, but which is not the main subject of this post). Kathy Sheehan, Mayor of the City of Albany, pointed out that the basic structure of local government and local government financing in New York was framed in the post-WW II era, when wealth tended to be concentrated in the cities. Then came cars and suburbanization. Now, instead of wealth, it’s poverty that’s concentrated in New York’s upstate cities. She was joined by Lovely Warren, Mayor of Rochester, and Svante Myrick, Mayor of Ithaca. Much of Rochester’s wealth also migrated to the suburbs, although that city also suffered heavily from the fall of Kodak and the decline of several other industrial companies. Beyond two substantial universities, tourism, and surrounding wine country, I’m not familiar with the specifics of Ithaca.

The three mayors pointed out that even politically disparate states like California and Texas grant legal authority to cities to annex surrounding land and communities, but New York does not. Though poverty may also be concentrated in cities in those states, their governments are not cut off from a fleeing tax base.

On Facebook, a local citizen, Julie O’Connor, set up an online neighborhood association. Triggered by the SCAA forum, its members have picked up the discussion. Much of it has been around the notion of an income tax on commuters. In Albany, this would, of course, tag many State employees, and that employer is much less likely to move out than might be the case in Rochester, Ithaca or other cities.

I’m not ready to get into the debate itself, but I do care about the fiscal health of all local governments. I’ve got the data handy, so I figured it might be useful to actually publish it. (Read our slogan, folks.)

It’s a fair point that commuters benefit from many of those functions. Heaven knows I’ve heard the complaints when the snow isn’t plowed fast enough and soon enough. It’s also a fair point that the system dynamics in circumstances like these often cause some serious drain swirling. The worse it gets, the worse it gets.

As you can see, the largest portion of the City’s expenditures goes toward public protection, mainly police and fire. That’s followed by general government, debt service (much of which I’d wager was incurred for street and other infrastructure maintenance), transportation (which would include more routine street-related functions, such as snow plowing) and then sanitation. After that, it’s small potatoes.

Figure: City of Albany Expenditure by Function, 2013

Data source: NYS OSC. Analysis, Public Signals, LLC

These data only reflect the City government and do not include those for elementary and secondary education, which require significant property taxes.

For the record, we live just outside the Albany City boundaries, actually within walking distance. We had moved from the suburbs into what’s called the Mansion Hill neighborhood (around the corner from the Governor’s mansion) some years ago. But right after we moved in to the new place, we got burned out. And needing to quickly find a new place to live, we found a place to rent and haven’t left. 

{ 0 comments }

Dennis Yusko reports in today’s Times Union that the NYS Department of Health is offering to “settle” 5,707 rate appeals with county nursing homes. Some of the appeals go back to 1995.

Specifically, the results, the technical methods, and the facility-specific data that the Department of Health uses to determine the precise rate at which New York’s Medicaid program reimburses a county nursing home are all subject to appeal. Like the methods, the appeals tend to be pretty technical.

Having the cash will be nice, of course, and my guess is that in most cases, they should take the deal. But before county officials get carried away with the significance of the offer, they should do the most basic arithmetic.

As Yusko points out:

Cash-strapped counties have argued for decades that the Medicaid reimbursement set by the state were far less that what it costs to provide eligible patients the necessary level of care.

Amid millions of dollars in annual red ink, several counties across the state have sold, or are working to sell, their nursing homes to private companies. 

Both statements are true – as far as they go. What Yusko’s story doesn’t say is that the cost of operating county-operated nursing homes is far greater than others in New York. That’s the primary reason they are such money losers.

What do the numbers mean on average and what do they mean for the future? Albany County lost over $6 million last year and its own projection is that it will lose over $2.1 million in 2015. And those figures don’t include everything (e.g., health benefit costs for retired Nursing Home employees). Getting $615,000 will help offset that. But, on average, the settlement numbers annualized amount to a whopping $31,200.

This is all about the past. None of it takes into account that New York has fundamentally changed how it structures management and payment for nursing home care of Medicaid clients. The movement of such clients, even nursing home clients, into various forms of managed care has already begun. That’s the future. And it won’t be friendly to nursing homes generally, and certainly not to expensive county-operated nursing homes.

Don’t spend it all in one place. And certainly don’t fool yourself into thinking that you can now make a county nursing home in New York a break-even deal.

{ 0 comments }

Is Campaign Finance Data Unusually Dirty Data? At first glance, it sure seems that way.

In an idle moment, just poking around looking at different data files, I decided to load some campaign finance data from New York’s Open Data site. Just go to the site, search on “elections,” pick a file and see what you get. I looked at a couple different files. The analysis below, which is typical, is from the file, “Campaign Finance Expenditures Submitted to the New York State Board of Elections Beginning 1999.” (Note, though I’ve a bunch of questions, I did not call the Board of Elections. I probably will, but I don’t think it necessary before playing with the issues discussed below.) 

Given the issues around campaign finance, should we be at all surprised that the data appear especially dirty? I don’t mean this in a political sense, but in a geek sense.

What a mess:

  • Misspellings and different spellings of the same names
  • Incomplete data
  • Non-existent data
  • Inconsistent date formats
  • Invalid data

And none of that even asks the question of whether the data are accurate.

Here’s an example of an easily avoidable problem: identifying the state in the contributor’s address. It should be pretty easy to get that one right. Right?

Yet, over the fifteen-year period, 1999 through 2014, 9.0 percent of the records (over 195,000 of them) did not even list a state. Those records were associated with reported contributions (perhaps accurate, perhaps valid, but perhaps not) of over $131 million, about 4.7 percent of the reported contributions. And more were clearly invalid. Less than 91 percent of the total records had a valid state identifier. Can you imagine if the Post Office had an error rate like that?
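A check like this is easy to script. The sketch below is illustrative only and is not the code behind the tables that follow. It assumes each record can be reduced to a (state, amount) pair, treats the 50 states plus DC as the valid set, and normalizes case and whitespace before checking. Those choices are mine, and the sample records are made up.

```python
from collections import Counter

# Assumed valid set: the 50 states plus DC. Territories and foreign
# addresses would need their own handling.
VALID_STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS "
    "MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV "
    "WI WY DC".split()
)


def classify_state(raw):
    """Label a raw state field as 'missing', 'valid' or 'invalid'."""
    code = (raw or "").strip().upper()
    if not code:
        return "missing"
    return "valid" if code in VALID_STATES else "invalid"


def summarize(records):
    """Tally record counts and dollar amounts by validity label."""
    counts, amounts = Counter(), Counter()
    for state, amount in records:
        label = classify_state(state)
        counts[label] += 1
        amounts[label] += amount
    return counts, amounts


def share(tally, label):
    """Percentage of the tally's total that falls under one label."""
    total = sum(tally.values())
    return 100.0 * tally[label] / total if total else 0.0


# Hypothetical records: (state field as entered, reported amount).
records = [("NY", 100.0), ("", 50.0), ("OZ", 25.0), (" ny ", 25.0)]
counts, amounts = summarize(records)
print("missing:", share(counts, "missing"), "% of records,",
      share(amounts, "missing"), "% of dollars")
```

Reporting both the share of records and the share of dollars matters, because, as the figures above show, the two can differ quite a bit.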

 

Table: Validity of Contributor State IDs, 1999 to 2014, NYS BOE (Public Signals LLC)

Figure: Validity of Contributor State IDs, NYS BOE (Public Signals LLC)

Here are some fun examples. The listed state code is on the left and the count of the number of times it was used is on the right. Recognize any of them? Though not shown, my favorite state code in this file was “OZ.” I guess that doesn’t mean Kansas, does it, Dorothy?

Figure: Examples of Invalid State Codes in BOE Campaign Expenditure Reports (Public Signals LLC)

The date data were at least as messy. My favorite was the submission dated in the year 200. Yes, only three digits. But there were also submissions from 1899, 1900, and 1901.
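Years like those are simple to flag. Here is a minimal sketch, assuming the year arrives as text and that anything outside 1999 through 2014 (the period the file is described as covering) is suspect. The function name and bounds are illustrative and not part of any Board of Elections tooling.

```python
def is_plausible_year(raw, first=1999, last=2014):
    """True only for a four-digit year within the expected reporting window."""
    text = str(raw).strip()
    if not (text.isascii() and text.isdigit() and len(text) == 4):
        return False
    return first <= int(text) <= last


# The odd years mentioned above, plus one ordinary year for contrast.
years = ["200", "1899", "1900", "1901", "2013"]
suspect = [y for y in years if not is_plausible_year(y)]
print(suspect)
```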
 
The years listed are in a field labeled “election year.” Why are the numbers for 2013 so much greater than 2014? Draw your own conclusions.
 
I haven’t explored every data set on New York’s Open Data site, but of the files I have looked at, the elections data files are certainly the lowest quality. Why might that be? Well, what are the usual explanations in other domains?

  • No one involved in the process of preparing, submitting, and reviewing (never mind analyzing) the data has a stake in clean data. Indeed, some might even be advantaged by dirty data, as it clouds and muddies what might otherwise be evident.
  • There’s little or no penalty for inadequate data.
  • Campaigns tend to be short-term affairs, especially losing ones. So even if inclined to get it right, there’s no opportunity for improvement. 

Well, you can add your own theories. I have others. They’re less geeky and much more cynical.

{ 0 comments }

Here’s more detail regarding the last post. It shows percent change in sales tax receipts from NYS Tax and Finance, rank ordered by percent change from SFY 2003 to 2014.


{ Comments on this entry are closed }

Variation, Always Variation

by John W Rodat on December 22, 2014

Was running some quick numbers this morning and generated the attached graphic.

By State Fiscal Year, it shows the percent change in sales tax distributions by the New York State Department of Taxation and Finance to each county outside New York City. The data are indexed, with each year showing the percent change from the first year (as opposed to the prior year). From the receiving end, it shows percent change in county sales and use tax revenue from SFY 2003 to each year after.
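To make the distinction between the two kinds of percent change concrete, here is a small sketch. The amounts are made up, and the function names are only for illustration. This is not the code used to produce the graphic.

```python
def percent_change_from_base(series):
    """Percent change of each year's amount relative to the earliest year."""
    base = series[min(series)]
    return {
        year: 100.0 * (amount - base) / base
        for year, amount in sorted(series.items())
    }


def percent_change_from_prior(series):
    """Percent change of each year's amount relative to the year before it."""
    years = sorted(series)
    return {
        year: 100.0 * (series[year] - series[prev]) / series[prev]
        for prev, year in zip(years, years[1:])
    }


# Hypothetical distributions by state fiscal year.
distributions = {2003: 100.0, 2004: 110.0, 2005: 121.0}

indexed = percent_change_from_base(distributions)
year_over_year = percent_change_from_prior(distributions)
print(indexed)
print(year_over_year)
```

Indexing to the first year shows cumulative growth, so counties can be ranked by where they ended up. Year-over-year change would instead highlight the volatility along the way.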

The data simply reflect actual funds and are not adjusted for changes in sales tax rates. As can be seen, the largest percent change came in Oswego County. Though not labeled, the second largest was Jefferson County. The lowest was Albany County, which is particularly problematic since Albany is also especially dependent on sales tax revenues. Also unlabeled, the second lowest was Ulster County.

From start to finish, there’s more than a six-fold difference from the lowest to the highest. That’s always worth exploring.

I’ll put up an interactive version later.


 

{ Comments on this entry are closed }

Edward Snowden is the Citizen Four for government misuse of data. Who’s the Snowden for corporate use?

The car ride company Uber has a callous corporate culture. It’s not just annoying. It’s frightening. Read Tufekci and King in the New York Times: We Can’t Trust Uber

{ Comments on this entry are closed }

Ghost of Tom Joad

by John W Rodat on November 12, 2014

Somehow the “Ghost of Tom Joad” feels timely.

Bruce Springsteen and the E Street Band with Tom Morello at Madison Square Garden in NYC, October, 2009.

{ Comments on this entry are closed }